Today Erik and Audra attended a WebEx session from the Internet Archive on new features in ArchiveIT 4.0. The presenters had me from the first few minutes, when they announced that this year had been named the ‘year of metadata’ at the Internet Archive!
They focused on new features including metadata searching, crawl date limiting, and improved video crawling and streaming.
They have also enhanced their reporting features, notably by introducing a URL report that shows exactly which URLs were archived during a given crawl. They also introduced a number of automatic metadata harvesting features during the seed assignment process, along with some new scope-it features that help you set constraints on specific hosts.
One interesting metadata feature they introduced was the ability to export metadata records for archived items to both MARC and MODS. I thought this was a promising way to leverage archived content in local indexes or web services. They also introduced a third-party tool called ProxyToggle, a Firefox plug-in that helps with quality control testing of archived content.
3 Comments on ‘ArchiveIT 4.0 training’
Thanks for posting this, Erik!
Here are the 4.0 release notes: https://webarchive.jira.com/wiki/display/ARIH/Archive-It+4.0+Release+Notes
Here’s the page about exporting to MARC: https://webarchive.jira.com/wiki/display/ARIH/Exporting+Archive-It+metadata+and+converting+it+to+MARC
Thanks Erik and Audra for keeping us up to snuff on AT. I, for one, believe AT is the ‘thing’ that will help us as a unit become more organized and efficient. Your work on this helps us get closer to that goal! Great work.
Sounds like there are some big improvements in 4.0. Glad to hear there is more specific crawl control and that you can tell what URL is crawled. Thanks!