Migrating Bibliographic Records Using the ESI Migration Tools

The following procedure explains how to migrate bibliographic records from MARC records into Evergreen. This is a general guide and will need to be adjusted for your specific environment. It does not cover exporting records from specific proprietary ILSs. For help exporting records from your current system, refer to that system's manuals or ask the Evergreen community.

  1. Download the Evergreen migration utilities from the git repository.

    Use the command git clone git://git.esilibrary.com/git/migration-tools.git to clone the migration tools.

    Install the migration tools:

    
    
    cd migration-tools/Equinox-Migration
    perl Makefile.PL
    make
    make test
    make install
    
    
    
  2. Add environment variables for the migration and import tools. These paths must point to:

    • the import perl scripts bundled with Evergreen
    • the folder where you extracted the migration tools
    • the location of the Equinox-Migration perl modules
    • the location of the Evergreen perl modules (e.g. perl5)

    export PATH=[path to Evergreen]/Open-ILS/src/extras/import:/[path to migration-tools]/migration-tools:$PATH:.
    export PERL5LIB=/openils/lib/perl5:/[path to migration-tools]/Equinox-Migration/lib
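
Before continuing, it can help to confirm that the tools are actually reachable through PATH. The following is a minimal sketch, not part of the migration tools: check_tools is a hypothetical helper name, and it only relies on the standard command -v lookup.

```shell
# Hypothetical helper: report which of the given commands are not on PATH.
# Prints one missing name per line and returns non-zero if any are missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_tools marc_cleanup fingerprinter match_fingerprints extract_loadset extract_holdings marc2bre.pl parallel_pg_loader.pl should print nothing once both export lines are in effect.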
    
    
  3. Dump MARC records into MARCXML using yaz-marcdump:

    
    
    echo '<?xml version="1.0" encoding="UTF-8" ?>' > imported_marc_records.xml
    yaz-marcdump -f MARC-8 -t UTF-8 -o marcxml imported_marc_records.mrc >> imported_marc_records.xml
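
A quick record count after the conversion makes it easier to notice records lost in later steps. This is a rough sketch under the assumption that each record starts with a <record> or <record ...> tag; count_marcxml_records is a hypothetical helper name, and grep -o is assumed to be available (it is in GNU and BSD grep).

```shell
# Hypothetical helper: count opening <record> tags in a MARCXML file.
# Closing </record> tags are not matched because of the leading "<record".
count_marcxml_records() {
    grep -o '<record[ >]' "$1" | wc -l | tr -d ' '
}
```

Running count_marcxml_records imported_marc_records.xml now, and again on the output of later steps, gives numbers that can be compared.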
    
    
    
  4. Test validity of XML file using xmllint

    
    
    
    xmllint --noout imported_marc_records.xml 2> marc.xml.err
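
Because the command above redirects errors to marc.xml.err, a clean run leaves that file empty. A minimal sketch of a check follows; xml_is_clean is a hypothetical helper name, not part of xmllint or the migration tools.

```shell
# Hypothetical helper: succeed only if the xmllint error file exists and is empty.
xml_is_clean() {
    err_file=$1
    if [ ! -e "$err_file" ]; then
        echo "no error file at $err_file; was xmllint run?" >&2
        return 2
    fi
    if [ -s "$err_file" ]; then
        # Show the first few messages so the problem records can be located.
        echo "xmllint reported problems; first lines:" >&2
        head -n 5 "$err_file" >&2
        return 1
    fi
    return 0
}
```

Call it as xml_is_clean marc.xml.err and only continue to the next step when it succeeds.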
    
    
    
  5. Clean up the MARCXML file using the marc_cleanup utility:

    
    marc_cleanup --marcfile=imported_marc_records.xml --fullauto [--renumber-from #] -ot 001
    	
    

    The --renumber-from option is required if you already have bibliographic records in your system. Use it to set the starting ID number higher than the last ID in the biblio.record_entry table. The marc_cleanup command generates a file called clean.marc.xml.
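
One way to pick the starting ID is to take the largest existing ID and add one. The sketch below assumes this is acceptable for your site; next_start_id is a hypothetical helper name, and the psql connection options in the comment are assumptions to adjust for your database.

```shell
# Hypothetical helper: given the largest existing id (empty if the table has
# no rows), print the first id that is higher than it.
next_start_id() {
    max_id=${1:-0}
    echo $((max_id + 1))
}

# Possible usage against a live database (connection flags are assumptions):
#   max_id=$(psql -U evergreen -A -t -c "select max(id) from biblio.record_entry")
#   marc_cleanup --marcfile=imported_marc_records.xml --fullauto \
#       --renumber-from "$(next_start_id "$max_id")" -ot 001
```

The -A and -t options make psql print only the bare value, which is what the helper expects.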

  6. Create a fingerprint file using the fingerprinter utility:

    
    fingerprinter -o incumbent.fp -x incumbent.ex clean.marc.xml
    			
    

    fingerprinter is used for deduplication of the incumbent records. The -o option specifies the output file and the -x option specifies the error output file.

  7. If your system already contains previously imported bibliographic records, create a fingerprint file for those existing Evergreen records using the fingerprinter utility:

    
    fingerprinter -o production.fp -x production.fp.ex --marctype=MARC21 existing_marc_records.mrc \
    --tag=901 --subfield=c
    
    

    fingerprinter is used for deduplication of the incumbent records.

  8. Create a merged fingerprint file removing duplicate records.

    
    cat production.fp incumbent.fp | sort -r > dedupe.fp
    match_fingerprints [-t start id] -o records.merge dedupe.fp
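
Since sorting only reorders lines, dedupe.fp should contain exactly as many lines as production.fp and incumbent.fp combined. A small sanity check along those lines might look like this; same_line_total is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the merged file has exactly as many lines
# as its inputs combined.
# Usage: same_line_total MERGED INPUT...
same_line_total() {
    merged=$1
    shift
    # Arithmetic expansion strips any padding that wc may add.
    expected=$(( $(cat "$@" | wc -l) ))
    actual=$(( $(wc -l < "$merged") ))
    if [ "$expected" -ne "$actual" ]; then
        echo "expected $expected lines, found $actual" >&2
        return 1
    fi
}
```

For the files above, the call would be same_line_total dedupe.fp production.fp incumbent.fp, run before match_fingerprints.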
    
    
  9. Create a new import XML file using the extract_loadset utility

    extract_loadset -l 1 -i clean.marc.xml -o merged.xml records.merge
    
  10. Extract all of the currently used TCNs and generate the .bre and .ingest files to prepare for the bibliographic record load.

    
    psql -U evergreen -c "select tcn_value from biblio.record_entry where not deleted" \
    | perl -npe 's/^\s+//;' > used_tcns
    marc2bre.pl --idfield 903 [--startid=#] --marctype=XML -f merged.xml \
    --used_tcn_file=used_tcns > evergreen_bre_import_file.bre
    
    

    Note

    The --startid option needs to match the start ID used in earlier steps and must be higher than the largest id value in the biblio.record_entry table. The --idfield option should match the MARC datafield used to store your record IDs.

  11. Ingest the bibliographic records into the Evergreen database.

    
    
    parallel_pg_loader.pl \
    -or bre \
    -or mrd \
    -or mfr \
    -or mtfe \
    -or mafe \
    -or msfe \
    -or mkfe \
    -or msefe \
    -a mrd \
    -a mfr \
    -a mtfe \
    -a mafe \
    -a msfe \
    -a mkfe \
    -a msefe evergreen_bre_import_file.bre > bibrecords.sql
    
    
    
  12. Load the records using psql and the SQL script generated in the previous step.

    
    
    psql -U evergreen -h localhost -d evergreen -f bibrecords.sql
    psql -U evergreen < ~/Ever*/Open-ILS/src/extras/import/quick_metarecord_map.sql > log.create_metabib
    
    
    
  13. Extract holdings from MARC records for importing copies into Evergreen using the extract_holdings utility.

    
    extract_holdings --marcfile=clean.marc.xml --holding 999 --copyid 999i --map holdings.map
    
    

    This command extracts holdings based on the 999 datafield in the MARC records. The copy ID is generated from subfield i in the 999 datafield. You may need to adjust these options based on the field used for holdings information in your MARC records.

    The map option holdings.map refers to a file to be used for mapping subfields to the holdings data you would like extracted. Here is an example based on mapping holdings data to the 999 data field:

    
    callnum 999 a
    barcode 999 i
    location 999 l
    owning_lib 999 m
    circ_modifier 999 t
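
A malformed map line is easy to miss, so it may be worth checking the file before running extract_holdings. The sketch below assumes, based only on the example above, that every line has three whitespace-separated fields: a name, a three-digit tag, and a one-character subfield code. validate_holdings_map is a hypothetical helper name.

```shell
# Hypothetical helper: flag map lines that do not look like "name TAG SUBFIELD".
# The expected shape is an assumption taken from the example map above.
validate_holdings_map() {
    awk '
        NF == 0 { next }    # ignore blank lines
        NF != 3 || $2 !~ /^[0-9][0-9][0-9]$/ || length($3) != 1 {
            printf "line %d looks malformed: %s\n", NR, $0
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

Running validate_holdings_map holdings.map prints any suspicious lines and returns non-zero if it finds one.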
    
    

    Running the extract_holdings script should produce an SQL script, HOLDINGS.pg, similar to:

    BEGIN;
    
    egid, hseq, l_callnum, l_barcode, l_location, l_owning_lib, l_circ_modifier,
    40      0       HD3616.K853 U54 1997    30731100751928  STACKS  FENNELL BOOK
    41      1       HV6548.C3 S984 1998     30731100826613  STACKS  FENNELL BOOK
    41      2       HV6548.C3 S984 1998     30731100804958  STACKS  BRANTFORD       BOOK
    ...
    

    Edit the HOLDINGS.pg SQL script like so:

    BEGIN;
    
    TRUNCATE TABLE staging_items;
    
    COPY staging_items (egid, hseq, l_callnum, l_barcode, l_location,
    l_owning_lib, l_circ_modifier) FROM stdin;
    40      0       HD3616.K853 U54 1997    30731100751928  STACKS  FENNELL BOOK
    41      1       HV6548.C3 S984 1998     30731100826613  STACKS  FENNELL BOOK
    41      2       HV6548.C3 S984 1998     30731100804958  STACKS  BRANTFORD       BOOK
    \.
    
    COMMIT;
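
Rows read from stdin in PostgreSQL's default text format are tab-separated, so a row with a missing or extra column will make the load fail. The following sketch checks the column count of the data rows before loading. It assumes data rows are the lines that start with a digit, as in the example above, and check_copy_rows is a hypothetical helper name.

```shell
# Hypothetical helper: verify that every data row has the expected number of
# tab-separated fields. Data rows are assumed to start with a digit (the egid).
# Usage: check_copy_rows FILE EXPECTED_FIELD_COUNT
check_copy_rows() {
    file=$1
    expected=$2
    awk -F '\t' -v want="$expected" '
        /^[0-9]/ && NF != want {
            printf "line %d has %d fields, expected %d\n", NR, NF, want
            bad = 1
        }
        END { exit bad + 0 }
    ' "$file"
}
```

With the seven columns listed in the example, the call would be check_copy_rows HOLDINGS.pg 7.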
    

    This file can be used for importing holdings into Evergreen. The egid is a critical column: it links the volume and copy to the bibliographic record. Please refer to the section on importing holdings for the steps to import your holdings into Evergreen.