Google Refine and NIFSTD

This tutorial explains how to use Google Refine with NIFSTD (the NIF Standard ontology) to align the terms in your data with ontology terms.

Step 1: Download Google Refine
http://code.google.com/p/google-refine/wiki/Downloads

Step 2: Load Google Refine

Step 3: Browse to your data file and click Create Project (upper right-hand side).


Step 4: Find the column that you would like to refine. (Note: it is very useful to work on a duplicate of the column, because Google Refine replaces the text in the column it reconciles.) To start, click the column header to open its drop-down menu.

Step 5: Select “Start reconciling”, then choose the service that you want to reconcile against. To reconcile against the NIFSTD ontology, use the Add Standard Service option (bottom left).

Step 6: Add the NIF service by entering its URL: http://nif-services.neuinfo.org/ontoquest/reconcile

Step 7: Reconcile against no particular type (bottom middle button). Choose the column to reconcile.

Step 8: Select Start Reconciling and wait until Google Refine and the NIF service come to a reasonable agreement as to your data.

Step 9: Review your data and export it back to Excel (or text). For terms that are still unaligned at this point, Google Refine lets you create a new topic: click the double checkmark to align every instance of that term in your data to a new label that you choose.

Step 10: Rejoice, for you are (at least partially) reconciled with NIFSTD!
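
For readers who prefer scripting, the reconciliation in Steps 6 to 8 can be roughly sketched outside the Google Refine interface. The sketch below assumes that the NIF service follows the standard Refine reconciliation protocol (a form field named "queries" holding a JSON object of queries, and a JSON response with a "result" list per query); this tutorial does not confirm that, nor that the service is still reachable. The function names `build_reconcile_request` and `best_matches` are hypothetical helpers, not part of Google Refine or NIF.

```python
import json
import urllib.parse
import urllib.request

# Service URL from Step 6; whether it still responds is not guaranteed.
NIF_RECONCILE_URL = "http://nif-services.neuinfo.org/ontoquest/reconcile"


def build_reconcile_request(terms, limit=3, url=NIF_RECONCILE_URL):
    """Build (but do not send) a batch reconciliation request.

    Assumption: the service accepts a form field "queries" containing a
    JSON object that maps arbitrary keys (q0, q1, ...) to query objects.
    """
    # No "type" key is set, mirroring "no particular type" in Step 7.
    queries = {
        f"q{i}": {"query": term, "limit": limit}
        for i, term in enumerate(terms)
    }
    body = urllib.parse.urlencode({"queries": json.dumps(queries)})
    # Supplying a request body makes urllib issue a POST.
    return urllib.request.Request(url, data=body.encode("utf-8"))


def best_matches(response, terms):
    """Pick the first candidate per term from a decoded JSON response.

    Assumption: each entry looks like {"result": [candidate, ...]},
    with candidates ordered best first.
    """
    matches = {}
    for i, term in enumerate(terms):
        candidates = response.get(f"q{i}", {}).get("result", [])
        matches[term] = candidates[0] if candidates else None
    return matches
```

To actually send the request, you could pass the result of `build_reconcile_request(...)` to `urllib.request.urlopen`, decode the body with `json.load`, and hand the decoded object to `best_matches`. Terms that come back as `None` correspond to the unaligned terms handled manually in Step 9.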


Trick for splitting the term and the ID into separate columns:

  1. In Excel, copy the whole reconciled column.
  2. Open Word and use Paste Special (plain text).
  3. Find all pipe characters (|) and replace them with a tab (^t).
  4. Copy the entire content of your Word document.
  5. Paste it back into a new column in Excel. Excel should be smart enough to give you two columns, one for the terms and one for the IDs.
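
The same split can be sketched in a few lines of Python instead of a round trip through Word. This is a minimal sketch that assumes each exported cell has the form `term|id` with a single pipe separating the two parts; the helper names and the sample values are made up for illustration.

```python
def split_term_and_id(cell, sep="|"):
    """Split one exported cell into (term, id) at the first separator.

    Cells without a separator (for example, unreconciled terms) are
    returned with an empty id.
    """
    term, found, identifier = cell.partition(sep)
    return term.strip(), (identifier.strip() if found else "")


def split_column(cells, sep="|"):
    """Apply split_term_and_id to every cell of a column."""
    return [split_term_and_id(cell, sep) for cell in cells]


# Placeholder values, not real NIFSTD identifiers.
column = ["Hippocampus|example_id_1", "Cerebellum | example_id_2", "unmatched term"]
for term, identifier in split_column(column):
    print(f"{term}\t{identifier}")
```

Printing with a tab between the two parts reproduces what the Word find-and-replace step does, so the output can be pasted into Excel as two columns.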
