Ingestion / Running Engines

Q: How do users ingest data into Illuminate and run cognitive engines?

A: Users ingest data and run cognitive engines against it directly in the CMS application, using either file upload or a built-in ingestion adapter such as SFTP, Dropbox, or Amazon S3.

Users can also organize their data into folders in CMS. These folders and all imported data are then automatically made available in Illuminate, in the sidebar on the left side of the screen. Note that the user must create the folder structure manually before ingestion.

Illuminate users can search against the output of any of the cognitive engines in aiWARE.

Q: What file formats does Illuminate support?

A: Veritone Illuminate supports all of the data formats that aiWARE currently supports, including:

  • Audio file types (e.g. wav, mp3)

  • Video file types (e.g. mp4, wmv)

  • Text-based documents (e.g. Word, Excel, PowerPoint, PDF)

Certain engines can only be run against certain file types. The supported file types are listed below by engine type.

Text Extraction

  • doc, docx, txt, pdf, eml, ppt 

  • Roadmap: msg, rtf


Audio and video file formats: files must open in general-purpose media players (i.e., they must be unencrypted and use non-proprietary codecs)


Audio (must be transcribed first), Video (must be transcribed first), doc, docx, txt, pdf, eml, ppt 

Object Detection

  • Video, images (jpeg, png, etc.)

  • Video OCR 


  • Facial Detection/Recognition 

Unsupported: pst, zip (archives)
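
As an illustration of the Text Extraction and Unsupported lists above, the following is a minimal sketch of a pre-upload check a user could run locally before ingesting files through CMS. It is not part of Illuminate or aiWARE; the function name and the "unknown" category are hypothetical, and only the extensions themselves come from the lists above.

```python
from pathlib import PurePath

# Extensions copied from the lists above; the set names are illustrative only.
TEXT_EXTRACTION_EXTENSIONS = {"doc", "docx", "txt", "pdf", "eml", "ppt"}
UNSUPPORTED_EXTENSIONS = {"pst", "zip"}


def classify_for_text_extraction(filename: str) -> str:
    """Classify a file name by its extension (hypothetical helper).

    Returns "supported", "unsupported", or "unknown" (extension not
    mentioned in either list, e.g. audio/video handled by other engines).
    """
    ext = PurePath(filename).suffix.lower().lstrip(".")
    if ext in UNSUPPORTED_EXTENSIONS:
        return "unsupported"
    if ext in TEXT_EXTRACTION_EXTENSIONS:
        return "supported"
    return "unknown"
```

The check is case-insensitive on the extension, so "Report.PDF" and "report.pdf" are treated the same way. Roadmap formats (msg, rtf) fall into "unknown" here because they are not yet supported.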

Q: How are files identified in Illuminate?

A: Before being exposed in Illuminate, every file ingested into aiWARE is assigned a unique document identifier consisting of a series of digits, referred to as the media ID. The media ID is assigned automatically, and users cannot control its format (for example, by adding a custom prefix or suffix).

Customers might ask if they can create custom document identifiers. Currently, aiWARE does not support this capability.

Accessing & Using Illuminate

Q: How do users access Illuminate?

A: Users log into aiWARE and select Veritone Illuminate from the App Switcher, the same method used to access Discovery, CMS, Redact, Identify, and Attribute.

Q: What’s a common workflow that a user might follow in Illuminate?

A: Users will start by ingesting their data via CMS and running the cognitive engines that they desire. From there, users will access Illuminate via the App Switcher. Once in Illuminate, users can then find what’s relevant in their data through search and exploration (via the text analytics visualization). Then, users can tag their data so that it’s easily identifiable in the future, send to Redact, and export.

High level: CMS -> Illuminate -> Search & Explore -> Tag / Send to Redact / Export

Q: How does the text analytics “sunburst” work?

A: The sunburst diagram visualizes the entities (e.g., people, organizations, and locations) that the text analytics engine automatically identifies. Currently, this capability identifies entities only in audio and video transcripts (not in text-based documents, faces, or objects).


Q: How and where does Illuminate export data?

A: Illuminate currently supports exporting data to an end user-supplied Amazon S3 environment, and we plan to add more destinations in the future. To initiate an export, the user selects one, multiple, or all items in Illuminate’s table view and then clicks the export icon (a cloud) in the table’s header row. Export is handled entirely behind the scenes. Once the export job has completed successfully, the user receives an email with a link to download the exported files from S3.

Q: What is included in an export file?

A: Files that are exported from Illuminate are contained in a single zip file. Within that zip file are:

  • A folder that contains all exported media assets (audio, video, text-based documents)

  • A folder that contains all json-formatted time-correlated transcripts related to those media assets

  • A pipe-delimited text file that serves as an index, and associates the exported media assets with their corresponding transcripts
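
For users who want to process an export programmatically, the following is a minimal sketch of reading the pipe-delimited index from a downloaded export zip. The index file name, its column layout, and the UTF-8 encoding are assumptions, since they are not specified here; inspect an actual export before relying on them. The function name is hypothetical.

```python
import csv
import io
import zipfile


def read_pipe_index(zip_path, index_name, encoding="utf-8"):
    """Return the rows of a pipe-delimited index file inside an export zip.

    zip_path:   path to the downloaded export zip
    index_name: name of the index file inside the zip (assumed; check your export)
    Each row is returned as a list of column values; blank lines are skipped.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(index_name) as raw:
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            return [row for row in csv.reader(text, delimiter="|") if row]
```

Because the column layout is not documented here, the function returns raw rows rather than named fields. A caller can map columns to media assets and transcripts after checking which columns a real export contains.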

Q: Can Illuminate export to Relativity? OpenText? Others?

A: The export file that Illuminate produces is generic enough to be ingested by many common legal review platforms such as Relativity and OpenText, but we do not have a direct, automatic integration with any legal review platform. Users can manually upload the exported files into their desired destination.


Q: Where can Veritone Illuminate be deployed?

A: Currently, Veritone Illuminate is available to customers via our US and UK instances of AWS Commercial Cloud.
