
Knowledge Mining with Azure Search
2.5 quintillion bytes per day
80% of business-relevant information is unstructured
management free
keyword search
faceting
language analyzers
geospatial support
suggestions/auto-complete
customizable scoring
proximity search
synonyms
etc.
[Figure: sample documents in TIFF, HTML, and JPG formats, annotated with:]
Scott Guthrie
Title: Executive Vice President, C+E
Company: Microsoft
accent color: blue?


Cognitive Services capabilities
Infuse your apps, websites, and bots with human-like intelligence
A Quick Intro to Cognitive Services
Demo
INGEST: Customer Data (data in any format, any Azure store)
ENRICH: Annotated Documents (cognitive skills add annotations)
EXPLORE: Search Index (search)


Built-in skills and custom skills
Your custom skill goes here!
Azure Databricks | Azure Machine Learning | Machine Learning VMs
OCR (text recognition)
Enriching from the Azure Portal
Demo
[Diagram: skills in the enrichment pipeline]
OCR (text recognition)
handwritten text recognition
face detection
cryptonym extraction
redaction classifier
Custom translation skill (called over https)

Request:
{
  "values": [
    {
      "recordId": "7cad2",
      "data": { "value1": "I owe you 5 grand" }
    },
    {
      "recordId": "7cad3",
      "data": { "value1": "Just my 2 cents" }
    }
  ]
}

Response:
{
  "values": [
    {
      "recordId": "7cad2",
      "data": { "myOutput1": "Te debo cinco mil" }
    },
    {
      "recordId": "7cad3",
      "data": { "myOutput1": "Solo mis 2 centavos" }
    },
    …
  ]
}
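The request and response shapes above suggest what the body of a custom skill has to do: read each record, compute an output, and echo the same recordId back. The following is a minimal sketch of that contract in Python, not the implementation used in the talk. The `PHRASEBOOK` table and the `translate` and `run_custom_skill` names are hypothetical stand-ins; a real skill would call an actual translation service and would sit behind an HTTPS endpoint.

```python
import json
from typing import Callable, Dict

# Hypothetical stand-in for a real translation service (assumption, not from the slides).
PHRASEBOOK: Dict[str, str] = {
    "I owe you 5 grand": "Te debo cinco mil",
    "Just my 2 cents": "Solo mis 2 centavos",
}


def translate(text: str) -> str:
    """Look the phrase up in the toy phrasebook; return it unchanged if unknown."""
    return PHRASEBOOK.get(text, text)


def run_custom_skill(request: dict, fn: Callable[[str], str] = translate) -> dict:
    """Map every input record to an output record that keeps the same recordId."""
    values = []
    for record in request.get("values", []):
        text = record["data"]["value1"]
        values.append({
            "recordId": record["recordId"],
            "data": {"myOutput1": fn(text)},
        })
    return {"values": values}


request = {
    "values": [
        {"recordId": "7cad2", "data": {"value1": "I owe you 5 grand"}},
        {"recordId": "7cad3", "data": {"value1": "Just my 2 cents"}},
    ]
}
response = run_custom_skill(request)
print(json.dumps(response, ensure_ascii=False, indent=2))
```

Keeping the recordId on every output record is what lets the caller match results back to the inputs it sent.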
JFK and Wolters Kluwer
Demo
/document
    /content
    /normalized_images
        /1
        /2
        /…
        /n
"skills": [

{
"@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
"inputs":
[
{ "name": "text", "source": "/document/content" }
],
"outputs":
[
{ "name": "languageCode", "targetName": "myLanguageCode" },
{ "name": "languageName", "targetName": "myLanguageName" }
]
},
/document
    /content
    /normalized_images
        /1
        /2
        /…
        /n
    /myLanguageCode
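One way to read the language detection skill definition and the resulting tree is that a skill takes its inputs from paths in the enrichment tree and writes each output back under its targetName. The sketch below simulates that with plain Python dictionaries, assuming simple paths without wildcards and a default context of /document. `get_path`, `apply_skill`, and `detect_language_stub` are hypothetical helpers for illustration, not part of Azure Search.

```python
from typing import Any, Callable, Dict


def get_path(tree: Dict[str, Any], path: str) -> Any:
    """Resolve a simple path such as '/document/content' (no wildcards)."""
    node: Any = tree
    for part in path.strip("/").split("/"):
        node = node[part]
    return node


def apply_skill(tree: Dict[str, Any], skill: Dict[str, Any],
                impl: Callable[..., Dict[str, Any]]) -> Dict[str, Any]:
    """Read the skill inputs from the tree, run impl, attach outputs by targetName."""
    inputs = {i["name"]: get_path(tree, i["source"]) for i in skill["inputs"]}
    results = impl(**inputs)
    # Assumption: outputs attach at the skill's context, '/document' if none is given.
    parent = get_path(tree, skill.get("context", "/document"))
    for out in skill["outputs"]:
        parent[out.get("targetName", out["name"])] = results[out["name"]]
    return tree


def detect_language_stub(text: str) -> Dict[str, str]:
    """Toy detector (assumption): a real skill would call a language service."""
    is_french = "bonjour" in text.lower()
    return {
        "languageCode": "fr" if is_french else "en",
        "languageName": "French" if is_french else "English",
    }


language_skill = {
    "@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
    "inputs": [{"name": "text", "source": "/document/content"}],
    "outputs": [
        {"name": "languageCode", "targetName": "myLanguageCode"},
        {"name": "languageName", "targetName": "myLanguageName"},
    ],
}

tree = {"document": {"content": "Bonjour tout le monde"}}
apply_skill(tree, language_skill, detect_language_stub)
print(tree["document"]["myLanguageCode"])
```

Because the output lands at /document/myLanguageCode, a later skill can name that path as the source of one of its own inputs.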
…,
{
"@odata.type": "#Microsoft.Skills.Text.NamedEntityRecognitionSkill",
"categories": [ "Organization" ],
"defaultLanguageCode": "en",
"inputs":
[
{ "name": "text", "source": "/document/content" },
{ "name": "languageCode", "source": "/document/myLanguageCode" }
],
"outputs":
[
{ "name": "organizations", "targetName": "organizations" }
]
},
/document
    /content
    /normalized_images
        /1
        /2
        /…
        /n
    /organizations
        /1
        /2
        /…
        /n
    /myLanguageCode
…,
{
"@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
"uri": "https://myskill.azurewebsites.net/api/OrgId",
"context": "/document/organizations/*",
"httpHeaders": { "Api-Key": "mySecret" },
"inputs":
[
{ "name": "organizationName", "source": "/document/organizations/*" }
],
"outputs":
[
{ "name": "organizationId", "targetName": "organizationId" }
]
},
/document
    /content
    /normalized_images
        /1
        /2
        /…
        /n
    /organizations
        /1
            /organizationId
        /2
            /organizationId
        /…
            /organizationId
        /n
            /organizationId
    /myLanguageCode
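The context "/document/organizations/*" in the Web API skill above means the skill runs once per organization, and each organizationId is attached next to the organization it was computed for. A rough simulation of that fan-out follows. It assumes each organization is modeled as a dictionary with a "value" key, and it replaces the HTTPS call with a local lookup. `ORG_IDS`, `lookup_org_id`, and `enrich_organizations` are hypothetical names, and the exact batching the service uses is not shown on the slides.

```python
from typing import Callable, Dict, List

# Hypothetical id table standing in for the remote OrgId service (assumption).
ORG_IDS: Dict[str, str] = {"Microsoft": "org-001", "Icertis": "org-002"}


def lookup_org_id(name: str) -> str:
    """Toy lookup; a real skill would be reached over HTTPS."""
    return ORG_IDS.get(name, "unknown")


def enrich_organizations(organizations: List[dict],
                         lookup: Callable[[str], str] = lookup_org_id) -> List[dict]:
    """Simulate a skill whose context is '/document/organizations/*'."""
    # One request record per element matched by the wildcard.
    request = {"values": [
        {"recordId": str(i), "data": {"organizationName": org["value"]}}
        for i, org in enumerate(organizations)
    ]}
    # Stand-in for the call to the skill endpoint.
    response = {"values": [
        {"recordId": rec["recordId"],
         "data": {"organizationId": lookup(rec["data"]["organizationName"])}}
        for rec in request["values"]
    ]}
    # Attach each output under the element it came from, matched by recordId.
    by_id = {rec["recordId"]: rec["data"] for rec in response["values"]}
    for i, org in enumerate(organizations):
        org["organizationId"] = by_id[str(i)]["organizationId"]
    return organizations


organizations = [{"value": "Microsoft"}, {"value": "Icertis"}, {"value": "Contoso"}]
enrich_organizations(organizations)
print(organizations)
```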


Option 1: Flatten the data

/document
    /keyPhrases
        /0
        /1
        /…
        /n
    /organizations
        /0
            /organizationId
        /1
            /organizationId
        /…
            /organizationId
        /n
            /organizationId
    /images
        /0
            /tags
        /1
            /tags
        /…
            /tags
        /n
            /tags

{
  …
  "outputFieldMappings":
  [
    {
      "sourceFieldName": "/document/organizations/*/organizationId",
      "targetFieldName": "myClients"
    },
    …
  ]
}
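To make the flattening concrete, the sketch below resolves a wildcard source path against a small enrichment tree and collects the matches into one flat list under the target field name. It models the tree with Python dictionaries and lists, which is an assumption for illustration. `collect` and the sample ids are hypothetical, and the real indexer performs this mapping internally.

```python
from typing import Any, Dict, List


def collect(tree: Dict[str, Any], source_path: str) -> List[Any]:
    """Return every value matched by a path; '*' fans out over list elements."""
    nodes: List[Any] = [tree]
    for part in source_path.strip("/").split("/"):
        matched: List[Any] = []
        for node in nodes:
            if part == "*" and isinstance(node, list):
                matched.extend(node)
            elif isinstance(node, dict) and part in node:
                matched.append(node[part])
        nodes = matched
    return nodes


# Sample enrichment tree with made-up ids (assumption, not from the slides).
tree = {
    "document": {
        "organizations": [
            {"value": "Microsoft", "organizationId": "org-001"},
            {"value": "Icertis", "organizationId": "org-002"},
        ]
    }
}

mapping = {
    "sourceFieldName": "/document/organizations/*/organizationId",
    "targetFieldName": "myClients",
}

# The nested ids become a single flat collection field in the index document.
index_document = {mapping["targetFieldName"]: collect(tree, mapping["sourceFieldName"])}
print(index_document)
```

The trade-off of flattening is that the link between each id and its sibling values is lost in the index; only the list of ids survives.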
Option 2: Use Complex Types
- Coming soon!
- In Private Preview
- If interested, contact us at: azscustquestions@microsoft.com

/document
    /keyPhrases
        /0
        /1
        /…
        /n
    /organizations
        /0
            /organizationId
        /1
            /organizationId
        /…
            /organizationId
        /n
            /organizationId
    /images
        /0
            /tags
        /1
            /tags
        /…
            /tags
        /n
            /tags
Use Case: Icertis
Trusted by the world’s top companies
AUTOMOTIVE PHARMA/HEALTH CARE SOFTWARE/TECHNOLOGY CONSULTING/SERVICES

MANUFACTURING/DIST RETAIL/CONSUMER BANKING/FINANCE ENERGY/ENGINEERING

2+ Million Users
5+ Million Contracts
$500+ Billion in Contract Value
40+ Languages
90+ Countries

For the first time in history, contracts are being digitized, allowing enterprises to fully reimagine contract management.

AI is key to understanding the terabytes of unstructured data in contracts.

Understanding that can transform the foundation of commerce.
Icertis Contract Management (ICM)
Cognitive Search flow
Cognitive skills—custom and out of the box

Architecture
Scenario: Icertis contract management

1. Receive PDF contract in email
2. Extract text
3. Search for GDPR clauses
4. Spot risks, enrich data and search across languages
5. Prep for searchability


Try Cognitive Search in the Azure Portal
Documentation and Tutorials
JFK demo on GitHub: http://aka.ms/jfkfiles
Contact us: azscustquestions@microsoft.com
Thank you
