Due to technical or privacy issues, archived websites may not be exact copies of the original website at the time of the web crawl. Certain file types will not be captured, depending on how they are embedded in the site. These can include videos (including YouTube and Vimeo), PDFs (including Scribd or other PDF readers), RSS feeds and plug-ins (including Twitter), commenting platforms (Disqus, Facebook), Prezi presentations, images, or anything else that is not native to the site. Other parts of websites that the crawler has difficulty capturing include JavaScript, streaming content, database-driven content, and highly interactive content. Robots.txt exclusions also change how archived websites display compared to the original; sites with robots.txt exclusions often display as an index.
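As an illustration, a robots.txt file like the following (a hypothetical sketch, not drawn from any site in this collection) can block a crawler from the stylesheets, scripts, and media a page needs to render, which is one reason an archived capture may display as a bare index rather than the original page:

```
# Hypothetical robots.txt showing crawler exclusions
User-agent: *
Disallow: /css/       # stylesheets not captured, so archived pages render unstyled
Disallow: /js/        # scripts not captured, so interactive features break
Disallow: /media/     # embedded media missing from the capture
```

Each `Disallow` line tells compliant crawlers not to fetch anything under that path, so those files never enter the archive even though the original pages depend on them.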
Keyword and file-type search are also available for archived websites through the Web Archiving Service (WAS). To search for keywords within archived websites, click a site name link in the container list. From that page, click the "search" tab in the top left corner. From the search page, you can search by keyword, URL, and file type across all captures of one site or across all the sites in the project.