Abstract
Migrating data into Evergreen can be one of the most daunting tasks for an administrator. This chapter explains some procedures to help migrate bibliographic records, copies, and patrons into the Evergreen system. This chapter requires advanced ILS administration experience, knowledge of Evergreen data structures, and knowledge of how to export data from your current system (or access to data export files from your current system).
One of the most important and challenging tasks is migrating your bibliographic records to a new system. The procedure may differ depending on the system from which you are migrating and the content of the MARC records exported from the existing system. The procedures in this section deal with the process once the data from the existing system has been exported into MARC records. They do not cover exporting data from your existing non-Evergreen system.
Several tools for importing bibliographic records into Evergreen can be found in the Evergreen installation folder (/home/opensrf/Evergreen-ILS-1.6.1.6/Open-ILS/src/extras/import/) and are also available from the Evergreen repository (http://svn.open-ils.org/trac/ILS/browser/branches/rel_1_6_1/Open-ILS/src/extras/import).
If you are starting with MARC records from your existing system or another source, use the marc2bre.pl script to create the JSON representation of a bibliographic record entry (hence "bre") in Evergreen. marc2bre.pl can perform the following functions:
Converts MARC-8 encoded records to UTF-8 encoding
Converts MARC21 to MARC21XML
Selects the unique record number field (common choices are '035' or '001'; check your records, as you might be surprised how a supposedly unique field can actually contain duplicates, though marc2bre.pl will select a unique identifier for subsequent duplicates)
Extracts certain pertinent fields for indexing and display purposes (along with the complete MARC21XML record)
Sets the ID number of the first record from this batch to be imported into the biblio.record_entry table (hint: run the following SQL to determine what this number should be to avoid conflicts):

psql -U postgres evergreen
# SELECT MAX(id)+1 FROM biblio.record_entry;
If you are processing multiple sets of MARC records with marc2bre.pl before loading the records into the database, you will need to keep track of the starting ID number for each subsequent batch of records that you are importing. For example, if you are processing three files of MARC records with 10000 records each into a clean database, you would use the --startid 1, --startid 10001, and --startid 20001 parameters for each respective file.
Ignores “trash” fields that you do not want to retain in Evergreen
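The multi-batch --startid bookkeeping described above can be sketched with a small shell loop. The filenames and the assumption of exactly 10000 records per file are hypothetical; only the --startid flag itself is taken from this chapter, and the loop prints the invocations rather than running them:

```shell
#!/bin/sh
# Sketch: print marc2bre.pl invocations for three hypothetical MARC files
# of 10000 records each, advancing --startid by the batch size each time.
# In practice you would count the records in each file rather than assume
# a fixed batch size.
startid=1
batch_size=10000
for f in file1.mrc file2.mrc file3.mrc; do
  echo "perl marc2bre.pl --startid $startid $f > ${f%.mrc}.bre"
  startid=$((startid + batch_size))
done
```

Printing the commands first lets you verify the ID ranges before committing to a long-running conversion.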
If you use marc2bre.pl to convert your MARC records from the MARC-8 encoding to the UTF-8 encoding, it relies on the MARC::Charset Perl module to complete the conversion. When importing a large set of items, you can speed up the process by using a utility like marc4j or marcdumper to convert the records to MARC21XML and UTF-8 before running them through marc2bre.pl, passing the --marctype=XML flag to tell marc2bre.pl that the records are already in MARC21XML format with the UTF-8 encoding. If you take this approach, then due to a current limitation of MARC::File::XML you have to do a horrible thing and ensure that there are no namespace prefixes in front of the element names. marc2bre.pl cannot parse the following example:
<?xml version="1.0" encoding="UTF-8" ?>
<marc:collection xmlns:marc="http://www.loc.gov/MARC21/slim"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.loc.gov/MARC/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
  <marc:record>
    <marc:leader>00677nam a2200193 a 4500</marc:leader>
    <marc:controlfield tag="001">H01-0000844</marc:controlfield>
    <marc:controlfield tag="007">t </marc:controlfield>
    <marc:controlfield tag="008">060420s1950 xx 000 u fre d</marc:controlfield>
    <marc:datafield tag="040" ind1=" " ind2=" ">
      <marc:subfield code="a">CaOHCU</marc:subfield>
      <marc:subfield code="b">fre</marc:subfield>
    </marc:datafield>
    ...
But marc2bre.pl can parse the same example with the namespace prefixes removed:
<?xml version="1.0" encoding="UTF-8" ?>
<collection xmlns:marc="http://www.loc.gov/MARC21/slim"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.loc.gov/MARC/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
  <record>
    <leader>00677nam a2200193 a 4500</leader>
    <controlfield tag="001">H01-0000844</controlfield>
    <controlfield tag="007">t </controlfield>
    <controlfield tag="008">060420s1950 xx 000 u fre d</controlfield>
    <datafield tag="040" ind1=" " ind2=" ">
      <subfield code="a">CaOHCU</subfield>
      <subfield code="b">fre</subfield>
    </datafield>
    ...
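One way to strip the namespace prefixes is a plain text substitution with sed. The following sketch uses a tiny inline sample in place of a real MARC21XML export, and the filenames are hypothetical; note that it rewrites only element names, leaving the xmlns:marc declaration on the collection element intact, which matches the working example above:

```shell
# Sketch: strip the marc: prefix from element names so marc2bre.pl can
# parse the file. A one-line sample stands in for a real MARC21XML export.
printf '<marc:record><marc:leader>00677nam a2200193 a 4500</marc:leader></marc:record>\n' > records.xml
sed -e 's/<marc:/</g' -e 's/<\/marc:/<\//g' records.xml > records-noprefix.xml
cat records-noprefix.xml
```

A text substitution like this is safe here only because the prefix appears exclusively in tag names; for messier documents an XSLT transform would be the more robust choice.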
Once you have your records in Open-ILS JSON ingest format, you then need to use pg_loader.pl to convert these records into a set of SQL statements that you can use to load the records into PostgreSQL. The --order and --autoprimary command-line options (bre, mrd, mfr, etc.) map to class IDs defined in /openils/conf/fm_IDL.xml.
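A hedged sketch of this step follows. The filenames, the particular class IDs passed, and the exact option spelling are assumptions based on the description above, not a verified invocation; consult the usage output of your version of pg_loader.pl before running it:

```shell
# Sketch (assumptions: filenames and class list are illustrative).
# Convert the JSON ingest records produced by marc2bre.pl into SQL
# statements, ordering output by the bre class as defined in
# /openils/conf/fm_IDL.xml, then load the SQL into PostgreSQL.
perl pg_loader.pl --order bre --autoprimary bre < mymarc.bre > mymarc.sql
psql -U postgres evergreen < mymarc.sql
```

Reviewing the generated SQL file before loading it gives you a chance to catch ID conflicts or encoding problems while they are still cheap to fix.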
Once you have loaded the records into PostgreSQL, you can create metarecord entries in the metabib.metarecord table by running the following SQL:

psql evergreen
# \i /home/opensrf/Evergreen-ILS-1.6*/Open-ILS/src/extras/import/quick_metarecord_map.sql
Metarecords are required to place holds on items, among other actions.