New dtSearch .NET Spider API

dtSearch Corp. announces Version 7.2 of its product line for instantly searching terabytes of documents across a desktop, network, Internet or Intranet. The new version adds a .NET Spider API for the dtSearch Engine for Win & .NET, and updates the dtSearch Engine for Linux to the “terabyte indexer” code base. The new release also adds OpenOffice to the extensive list of supported file types.

All dtSearch products can index over a terabyte of text in a single index (as well as create and simultaneously search an unlimited number of indexes). Indexed search time is typically less than a second even across terabytes of data.
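To illustrate why an indexed search can answer quickly, and how one query can run against several indexes at once, here is a minimal sketch of an inverted index in Python. It is not dtSearch code and does not use the dtSearch API; the function names `build_index` and `search` are hypothetical, and a real terabyte-scale index would live on disk rather than in memory.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document names containing it.

    `docs` is a hypothetical {document name: text} mapping used only for
    illustration.
    """
    index = defaultdict(set)
    for name, text in docs.items():
        for word in text.lower().split():
            index[word].add(name)
    return index


def search(indexes, word):
    """Look the word up in every index and merge the hits.

    Each lookup is a dictionary access, so the cost depends on the number
    of indexes and hits, not on the total size of the indexed text.
    """
    hits = set()
    for index in indexes:
        hits |= index.get(word.lower(), set())
    return sorted(hits)
```

Building one index per collection and passing them all to `search` mirrors, in a very simplified way, the idea of searching several indexes in a single request.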

Along with the “terabyte indexer,” all dtSearch products generally share the same feature set:
- over two dozen indexed, unindexed, full-text and fielded data search options
- display of HTML, XML and PDF files with highlighted hits and with embedded images, links and formatting intact
- built-in HTML converters for browser display of non-Web-ready content (word processor, database, spreadsheet, presentation, ZIP, CSV, Unicode, and other popular file types) with highlighted hits
- XML-based distributed searching, including integrated display of local and remote content

The dtSearch Spider provides:
- support for public sites, secure HTTPS content, password-accessible sites, and forms-based authentication
- searching of Web-based content to any specified level of horizontal or vertical depth
- support for dynamically-generated content (ASP.NET, MS CMS, SharePoint, etc.) as well as static content (HTML, XML, PDF, etc.)
- integrated relevancy-ranking of Spidered and non-Spidered content, including WYSIWYG display of dynamic and static Web-ready content with highlighted hits
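The idea of crawling to a specified depth can be sketched as a breadth-first traversal with a hop limit. The Python example below is an illustration only, not the dtSearch Spider or its .NET API. It assumes that "vertical" depth means link hops within the starting site and that "horizontal" depth means following links onto other sites, which the announcement does not define. The `LINKS` table, the `crawl` function and the `follow_offsite` flag are all hypothetical, and an in-memory link table stands in for real HTTP requests.

```python
from collections import deque
from urllib.parse import urlsplit

# Hypothetical in-memory "web": page URL -> URLs it links to.
LINKS = {
    "http://a.example/": ["http://a.example/docs", "http://b.example/"],
    "http://a.example/docs": ["http://a.example/docs/api"],
    "http://a.example/docs/api": ["http://a.example/docs/api/deep"],
    "http://b.example/": ["http://b.example/news"],
}


def crawl(start, max_depth, follow_offsite=False, links=LINKS):
    """Breadth-first crawl returning {url: depth} for pages within max_depth hops."""
    start_host = urlsplit(start).netloc
    seen = {start: 0}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        depth = seen[url]
        if depth == max_depth:
            continue  # do not expand links past the depth limit
        for target in links.get(url, []):
            if target in seen:
                continue  # already scheduled at an equal or shallower depth
            if not follow_offsite and urlsplit(target).netloc != start_host:
                continue  # assumption: stay on the starting host by default
            seen[target] = depth + 1
            queue.append(target)
    return seen
```

A real crawler would fetch each page, extract its links, and handle authentication and errors; the depth bookkeeping shown here would stay the same.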

dtSearch Desktop with Spider instantly searches files on a PC (or on PC-accessible network drives). dtSearch Network with Spider searches across a network running in a client/server capacity. Both instantly search and display with highlighted hits a wide variety of content, including: email messages (Outlook, Outlook Express, Exchange, Eudora and other .MSG formats) along with the full text of email attachments, MS Office and now OpenOffice files, PDF, XML, HTML, ZIP, CSV, Unicode and other content. Through the dtSearch Spider, both applications can also add Web-based content to a local or network search.

dtSearch Web with Spider quickly publishes a large volume of instantly searchable data to an Internet or Intranet site. The Spider expands the scope of the searchable database beyond a site's own data to content on other sites. dtSearch Publish offers easy publishing of an instantly searchable document collection to CD, DVD, portable hard drive, and the like. The product can also mirror an existing Web site on CD/DVD. Both applications now add OpenOffice to the list of file types that they can publish (to the Web or to CD/DVD), instantly search, and display with highlighted hits.

The dtSearch Text Retrieval Engine lets developers add dtSearch search functionality to Web-based and other applications. The dtSearch Engine also provides developers access to dtSearch’s extensive file format support, including dtSearch’s WYSIWYG hit-highlighted search display of Web-ready files, and proprietary built-in HTML converters for non-Web-ready files such as MS Office documents.

The dtSearch Engine for Win & .NET supports SQL, C++, Delphi, Java, C#, VB.NET, ASP.NET, C++.NET, and ADO.NET. The new release adds further .NET APIs, including a new .NET Spider API, making the Spider functionality (described above) available for the first time through a .NET API. The new release also updates the dtSearch Engine for Linux, with C++ and Java APIs, to the current “terabyte indexer” code base. Finally, the new release adds OpenOffice to the list of file types that both versions of the dtSearch Engine can convert on-the-fly to HTML for display with highlighted hits.

The dtSearch product line generally offers over two dozen indexed, unindexed, fielded and full-text search options. These include: fuzziness adjustable from 0 to 10 (to sift through typographical and spelling errors), synonym/concept/thesaurus (both through a built-in thesaurus and through optional user-defined synonym rings), Boolean (and/or/not), natural language relevancy ranking (by hit term frequency, density and rarity), positional scoring ranking, phrase, phonic, wildcard, bilateral proximity, directed proximity, stemming, numeric range, user-defined variable term weighting, and special forensics options. The dtSearch product line also provides international language support through Unicode, covering hundreds of international languages.
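As a rough illustration of how an adjustable fuzziness setting can tolerate typographical errors, the Python sketch below matches words by Levenshtein edit distance. This is a swapped-in textbook technique, not dtSearch's fuzzy search algorithm, which the announcement does not describe. Treating the 0 to 10 setting as a maximum edit distance is an assumption, and the names `edit_distance` and `fuzzy_matches` are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_matches(term, vocabulary, fuzziness):
    """Return vocabulary words within `fuzziness` edits of `term`.

    Assumption: fuzziness 0..10 is interpreted as a maximum edit distance,
    so 0 means exact matches only (ignoring case).
    """
    if not 0 <= fuzziness <= 10:
        raise ValueError("fuzziness must be between 0 and 10")
    term = term.lower()
    return [w for w in vocabulary if edit_distance(term, w.lower()) <= fuzziness]
```

With a fuzziness of 0 only exact matches are returned; raising the value admits progressively more distant spellings, at the cost of more false positives.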

© 2006 Computing News