A Journal of Playing with Hydra Data
This part is based on the dive-into-hydra tutorial project.
What should I do to immigrate Hydra Data (Fedora files, Solr documents and indexes, etc) from one Hydra project to another project?
To figure out a way, I have explored following items:
Step 1: SQLite3
My first idea is to check database. Tutorial project uses SQLite3 as database. All files are included in
/db folder. Before going to production, you may get
SQLiteStudio to open these two SQLite3 databases. Each database has four tables, which are:
bookmarks: save information when a bookmark is created. Sample Fields:
schema_migrations: version of query schema
searches: save information when a search operation is performed. Sample Fields:
users: user information. Sample Fields:
It seems that database(SQLite3) here is used to record information in search process, after all previous data has been immigrated.
Now my question is: Does Hydra have a solution?
Step 2: Hydra
I did some research about a full immigration solution in Hydra. I didn't find any result, which is nature because Hydra includes so many components: Fedora, Solr, Blacklight, etc. It would be extremely hard to transfer all data from these components once, without any error.
After I re-think my requirement, I realize this is what I want: Get all data from Fedora in old project -> Put it to Fedora in new project, with same structure -> Solr in new project re-indexing -> Blacklight can find new documents
Now I want to start the process of export and import Fedora data.
Step 3: Fedora
Fedora4 provides a RESTful HTTP API for data processing.
To export serialized form of a resource, with all subtress and binary content, run following in your terminal (I adjust path based on my project):
curl -u fedoraAdmin:fedoraAdmin http://127.0.0.1:8983/fedora/dev/fcr:export?recurse=true&skipBinary=false"
Make sure to use your own fedora username and password.
This command will generate a group of
xml code, save it to a
xml file, such as
Now go to your new project and get a copy of your exported xml file. Run following command to import:
curl -u fedoraAdmin:fedoraAdmin -X POST --data-binary "@FedoraExport.xml" "http://127.0.0.1:8983/fedora/rest/dev/fcr:import?format=jcr/xml"
Now restart jetty and navigate to
http://127.0.0.1:8983/fedora/rest/dev. I'll be able to see all data, includes all subtrees, in the Fedora of new project.
I also noticed Fedora UI can import and export xml files. However, there is no recursive export feature there.
Problem: When I followed tutorial project, I made all the book nodes under /fedora/rest/dev. When I did export, I had to export dev node to make sure it includes book childnodes. After I imported dev node to new /fedora/rest/dev, I realized I had two dev nodes now. Another solution is to replace new fedora's dev node. However after I imported dev node directly to /fedora/rest, the new dev node will show as 'dev%5B2%5D' instead of 'dev'.
Possible Solution: I should create a summary node, which includes all my books data. I export this summary node and import it to /fedora/rest/dev. Now I can have exactly same structure as old fedora.
Now I want to start the process of Solr re-indexing.
Step 4: Solr
Working in process :)