{"category":"577e4bf24159cd1900d5d2ae","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2d3","createdAt":"2015-07-07T21:33:32.966Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":10,"body":"One of the very first steps necessary in processing text is to break the text apart into tokens and to group those tokens into sentences. We use the word \"tokens\" and not \"words\" because tokens can also be things like:\n\n* punctuation (exclamation points affect sentiment, for instance)\n* links (http://...)\n* possessive markers\n* and the like.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Language Approach\"\n}\n[/block]\nFor most European languages, tokenization is fairly straightforward - look for white space, look for punctuation, and the like. Each language has its own features, though - for instance, German makes extensive use of compound words and for some purposes such as sentiment it can be worth it to tokenize to the sub-word level. \n\nSome languages, such as Chinese, have no space breaks between words and tokenizing those languages requires the use of more sophisticated statistical models. Lexalytics has developed tokenization models for all of our supported languages.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Parts of Speech\"\n}\n[/block]\nMost NLP and text mining tools make use not just of a bucket of tokens but also the parts of speech. Knowing what part of speech a token is makes it more useful. Proper nouns (Lexalytics) are more likely to be a mention of person, place, or company, adjectives (terrible) are more likely to be sentiment phrases, and so on. In most languages, single words can be of multiple speech types depending on context - \"Love makes the world go round\" has \"love\" as a noun, while \"I love NLP\" has love as a verb. Determining the part of speech for a token requires evaluating the context the word appears in. \n\nLexalytics has developed POS tagging models for most of its supported languages, and returns POS tags along with the text output if desired. Our set of POS tags is an extension of the Penn Treebank set of POS tags.","excerpt":"","slug":"feature-1","type":"basic","title":"Tokenization and POS Tagging","__v":0,"childrenPages":[]}

Tokenization and POS Tagging


One of the very first steps in processing text is to break it apart into tokens and to group those tokens into sentences. We use the word "tokens" and not "words" because tokens can also be things like:

* punctuation (exclamation points affect sentiment, for instance)
* links (http://...)
* possessive markers
* and the like.

## Language Approach

For most European languages, tokenization is fairly straightforward: look for white space, look for punctuation, and the like. Each language has its own features, though. German, for instance, makes extensive use of compound words, and for some purposes such as sentiment it can be worth tokenizing to the sub-word level.

Some languages, such as Chinese, have no space breaks between words, and tokenizing those languages requires more sophisticated statistical models. Lexalytics has developed tokenization models for all of our supported languages.

## Parts of Speech

Most NLP and text mining tools make use not just of a bucket of tokens but also of their parts of speech. Knowing what part of speech a token is makes it more useful: proper nouns (Lexalytics) are more likely to be mentions of a person, place, or company; adjectives (terrible) are more likely to be sentiment phrases; and so on. In most languages, a single word can take multiple parts of speech depending on context. "Love makes the world go round" has "love" as a noun, while "I love NLP" has "love" as a verb. Determining the part of speech for a token therefore requires evaluating the context the word appears in.

Lexalytics has developed POS tagging models for most of its supported languages, and returns POS tags along with the text output if desired. Our set of POS tags is an extension of the Penn Treebank set of POS tags.
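As a concrete illustration of both steps, here is a minimal sketch using the open-source NLTK library rather than the Lexalytics engine (resource names can vary slightly across NLTK versions):

```python
# Illustration only: NLTK, not the Lexalytics tokenizer or tagger.
import nltk

nltk.download("punkt", quiet=True)                       # tokenizer models
nltk.download("averaged_perceptron_tagger", quiet=True)  # POS tagger model

# Tokens are not just words: punctuation comes back as its own token.
tokens = nltk.word_tokenize("I love NLP! Love makes the world go round.")
print(tokens)
# ['I', 'love', 'NLP', '!', 'Love', 'makes', 'the', 'world', 'go', 'round', '.']

# Penn Treebank-style tags; the tag for "love" depends on context
# (typically a verb tag in the first sentence, a noun tag in the second).
print(nltk.pos_tag(tokens))
```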
{"category":"577e4bf24159cd1900d5d2ae","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2d4","createdAt":"2015-07-07T21:33:50.928Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":11,"body":"[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Our Sentiment Values\"\n}\n[/block]\nLexalytics was the first vendor to ship a commercial sentiment engine, way back in 2003. Lexalytics measures sentiment from negative to positive, and returns a score running from negative to positive. The bigger the number, the more negative or positive the text is. It is up you to interpret that score - do you want to cut it into a three-point scale (negative, neutral, positive), a five-point scale (1 to 5), or, as some PR agencies do, \"negative\" and \"everything else?\"\n\nOur Semantria product returns both the score and a three-point scale of negative, neutral and positive based on a default neutral range of -0.05 to +0.22. You can use our default sentiment       range, or you can try your own. \n\nSentiment applies to just about everything. A sentiment score is returned for:\n\n* a whole document\n* each entity\n* each query and category\n* each theme\n\nThis allows you to drill deeper into the discussion - is the theme of \"customer service\" positive or negative? What is the distribution of sentiment for the query \"check in\" OR \"front desk\" in my data set?\n\nIn addition to the overall score, we returned the individual phrases that were found, along with their scores, and a lot more information on each one.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"How is it calculated?\"\n}\n[/block]\nLexalytics offers two methods of calculating sentiment. One works out of the box and is called phrase-based sentiment. The other is called model-based sentiment and requires you to build a machine learning model off your data to generate sentiment. Phrase based sentiment works well for most content sets and is easily configurable by a non-expert.\n\n**The short answer:**\n\nWe find sentiment words in a clever way and add up their scores.\n\n**Long answer:**\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Phrase-based Sentiment\"\n}\n[/block]\nPhrase-based sentiment builds on the POS tagging Lexalytics performs. When calculating sentiment for a document, the POS for words and phrases are evaluated against POS patterns and those that match the allowable patterns are then looked up in a large dictionary of pre-scored words and phrases. Why do we need POS patterns? Because a word might be sentiment bearing when it is one POS and non sentiment bearing in another. For instance, \"love\" is positive as a verb (\"I love NLP\"), but not positive when its part of a name (Courtney Love).\n\nThe pre-scored dictionary of words and phrases has scores ranging from -1 (always strongly negative) to +1 (always strongly positive) but most words and phrases are not at the ends, because many words can be more or less strong in different contexts. \n\n**Modifiers**\nOnce we have identified the sentiment words and gotten the scores from the dictionary, we look for modifiers to those phrases. 
The most common modifiers are negators (not, never) and intensifiers (very, somewhat). These multiply the score of the phrase. A negator generally flips the sentiment score to the reverse of the dictionary value, while an intensifier might increase the total score (very) or decrease it (somewhat). Additional non-obvious intensifiers include things like comparative clauses - we weight the conclusion of a clause more heavily than the beginning, and we also try to eliminate boilerplate expressions. In an expression like \"The food was bland, but we had a good time anyways\", we de-weight \"bland\" while unweighting \"good time\".  \"Good morning\" is not really sentiment bearing on its own, its just a standard greeting, while \"I had a really good morning today\" is sentiment bearing. Phrases we think are boilerplate have their sentiment scores heavily de-weighted. \n\nAt this point we have a list of words with modified scores, and we then add up the scores, weighting the total score by the number of words. \n\nAll of this applies to whatever the text is - a whole document, an entity mention, a theme, and so on. For anything smaller than the whole document, we create a type of summary called a lexical chain, and calculate the sentiment based on that. \n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Model-based Sentiment\"\n}\n[/block]\nModel-based sentiment is a technique of training a machine learning model on a set of texts that have been scored as negative, neutral or positive by humans. If you have your own set of data already classified by humans into those buckets, then building a model might be a good way for you to get sentiment tuned to your needs.\n\nModel sentiment applies only to the document level, and reports probabilities for the document to fall into negative, neutral or positive. If you need sentiment for entities or queries, you will need phrase-based sentiment instead of model sentiment.","excerpt":"","slug":"sentiment","type":"basic","title":"Sentiment","__v":0,"childrenPages":[]}

Sentiment


[block:api-header] { "type": "basic", "title": "Our Sentiment Values" } [/block] Lexalytics was the first vendor to ship a commercial sentiment engine, way back in 2003. Lexalytics measures sentiment from negative to positive, and returns a score running from negative to positive. The bigger the number, the more negative or positive the text is. It is up you to interpret that score - do you want to cut it into a three-point scale (negative, neutral, positive), a five-point scale (1 to 5), or, as some PR agencies do, "negative" and "everything else?" Our Semantria product returns both the score and a three-point scale of negative, neutral and positive based on a default neutral range of -0.05 to +0.22. You can use our default sentiment range, or you can try your own. Sentiment applies to just about everything. A sentiment score is returned for: * a whole document * each entity * each query and category * each theme This allows you to drill deeper into the discussion - is the theme of "customer service" positive or negative? What is the distribution of sentiment for the query "check in" OR "front desk" in my data set? In addition to the overall score, we returned the individual phrases that were found, along with their scores, and a lot more information on each one. [block:api-header] { "type": "basic", "title": "How is it calculated?" } [/block] Lexalytics offers two methods of calculating sentiment. One works out of the box and is called phrase-based sentiment. The other is called model-based sentiment and requires you to build a machine learning model off your data to generate sentiment. Phrase based sentiment works well for most content sets and is easily configurable by a non-expert. **The short answer:** We find sentiment words in a clever way and add up their scores. **Long answer:** [block:api-header] { "type": "basic", "title": "Phrase-based Sentiment" } [/block] Phrase-based sentiment builds on the POS tagging Lexalytics performs. When calculating sentiment for a document, the POS for words and phrases are evaluated against POS patterns and those that match the allowable patterns are then looked up in a large dictionary of pre-scored words and phrases. Why do we need POS patterns? Because a word might be sentiment bearing when it is one POS and non sentiment bearing in another. For instance, "love" is positive as a verb ("I love NLP"), but not positive when its part of a name (Courtney Love). The pre-scored dictionary of words and phrases has scores ranging from -1 (always strongly negative) to +1 (always strongly positive) but most words and phrases are not at the ends, because many words can be more or less strong in different contexts. **Modifiers** Once we have identified the sentiment words and gotten the scores from the dictionary, we look for modifiers to those phrases. The most common modifiers are negators (not, never) and intensifiers (very, somewhat). These multiply the score of the phrase. A negator generally flips the sentiment score to the reverse of the dictionary value, while an intensifier might increase the total score (very) or decrease it (somewhat). Additional non-obvious intensifiers include things like comparative clauses - we weight the conclusion of a clause more heavily than the beginning, and we also try to eliminate boilerplate expressions. In an expression like "The food was bland, but we had a good time anyways", we de-weight "bland" while unweighting "good time". 
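To make the scale concrete, here is a minimal sketch of cutting the raw score into that three-point scale. The thresholds are the Semantria defaults quoted above; the function itself is our illustration, not part of the API:

```python
def to_three_point(score: float, neutral_low: float = -0.05,
                   neutral_high: float = 0.22) -> str:
    """Bucket a raw sentiment score using the default neutral range."""
    if score < neutral_low:
        return "negative"
    if score > neutral_high:
        return "positive"
    return "neutral"

print(to_three_point(-0.4))   # negative
print(to_three_point(0.1))    # neutral (inside the default range)
print(to_three_point(0.6))    # positive
```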
"Good morning" is not really sentiment bearing on its own, its just a standard greeting, while "I had a really good morning today" is sentiment bearing. Phrases we think are boilerplate have their sentiment scores heavily de-weighted. At this point we have a list of words with modified scores, and we then add up the scores, weighting the total score by the number of words. All of this applies to whatever the text is - a whole document, an entity mention, a theme, and so on. For anything smaller than the whole document, we create a type of summary called a lexical chain, and calculate the sentiment based on that. [block:api-header] { "type": "basic", "title": "Model-based Sentiment" } [/block] Model-based sentiment is a technique of training a machine learning model on a set of texts that have been scored as negative, neutral or positive by humans. If you have your own set of data already classified by humans into those buckets, then building a model might be a good way for you to get sentiment tuned to your needs. Model sentiment applies only to the document level, and reports probabilities for the document to fall into negative, neutral or positive. If you need sentiment for entities or queries, you will need phrase-based sentiment instead of model sentiment.
[block:api-header] { "type": "basic", "title": "Our Sentiment Values" } [/block] Lexalytics was the first vendor to ship a commercial sentiment engine, way back in 2003. Lexalytics measures sentiment from negative to positive, and returns a score running from negative to positive. The bigger the number, the more negative or positive the text is. It is up you to interpret that score - do you want to cut it into a three-point scale (negative, neutral, positive), a five-point scale (1 to 5), or, as some PR agencies do, "negative" and "everything else?" Our Semantria product returns both the score and a three-point scale of negative, neutral and positive based on a default neutral range of -0.05 to +0.22. You can use our default sentiment range, or you can try your own. Sentiment applies to just about everything. A sentiment score is returned for: * a whole document * each entity * each query and category * each theme This allows you to drill deeper into the discussion - is the theme of "customer service" positive or negative? What is the distribution of sentiment for the query "check in" OR "front desk" in my data set? In addition to the overall score, we returned the individual phrases that were found, along with their scores, and a lot more information on each one. [block:api-header] { "type": "basic", "title": "How is it calculated?" } [/block] Lexalytics offers two methods of calculating sentiment. One works out of the box and is called phrase-based sentiment. The other is called model-based sentiment and requires you to build a machine learning model off your data to generate sentiment. Phrase based sentiment works well for most content sets and is easily configurable by a non-expert. **The short answer:** We find sentiment words in a clever way and add up their scores. **Long answer:** [block:api-header] { "type": "basic", "title": "Phrase-based Sentiment" } [/block] Phrase-based sentiment builds on the POS tagging Lexalytics performs. When calculating sentiment for a document, the POS for words and phrases are evaluated against POS patterns and those that match the allowable patterns are then looked up in a large dictionary of pre-scored words and phrases. Why do we need POS patterns? Because a word might be sentiment bearing when it is one POS and non sentiment bearing in another. For instance, "love" is positive as a verb ("I love NLP"), but not positive when its part of a name (Courtney Love). The pre-scored dictionary of words and phrases has scores ranging from -1 (always strongly negative) to +1 (always strongly positive) but most words and phrases are not at the ends, because many words can be more or less strong in different contexts. **Modifiers** Once we have identified the sentiment words and gotten the scores from the dictionary, we look for modifiers to those phrases. The most common modifiers are negators (not, never) and intensifiers (very, somewhat). These multiply the score of the phrase. A negator generally flips the sentiment score to the reverse of the dictionary value, while an intensifier might increase the total score (very) or decrease it (somewhat). Additional non-obvious intensifiers include things like comparative clauses - we weight the conclusion of a clause more heavily than the beginning, and we also try to eliminate boilerplate expressions. In an expression like "The food was bland, but we had a good time anyways", we de-weight "bland" while unweighting "good time". 
"Good morning" is not really sentiment bearing on its own, its just a standard greeting, while "I had a really good morning today" is sentiment bearing. Phrases we think are boilerplate have their sentiment scores heavily de-weighted. At this point we have a list of words with modified scores, and we then add up the scores, weighting the total score by the number of words. All of this applies to whatever the text is - a whole document, an entity mention, a theme, and so on. For anything smaller than the whole document, we create a type of summary called a lexical chain, and calculate the sentiment based on that. [block:api-header] { "type": "basic", "title": "Model-based Sentiment" } [/block] Model-based sentiment is a technique of training a machine learning model on a set of texts that have been scored as negative, neutral or positive by humans. If you have your own set of data already classified by humans into those buckets, then building a model might be a good way for you to get sentiment tuned to your needs. Model sentiment applies only to the document level, and reports probabilities for the document to fall into negative, neutral or positive. If you need sentiment for entities or queries, you will need phrase-based sentiment instead of model sentiment.
{"category":"577e4bf24159cd1900d5d2ae","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2d5","createdAt":"2015-07-07T21:34:07.286Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":12,"body":"Categorization is the process of putting documents into buckets of some kind. For NLP purposes, usually the buckets are the subjects you are looking to analyze. Those categories might be very broad, like newspaper sections (sports/world/art, etc) or they might be very focussed (the different substrates in silicon wafer manufacturing processes).\n\nThere are many ways to categorize documents. Different techniques are better suited to particular categorization needs. Lexalytics provides a variety of ways to go about categorizing your documents.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Queries\"\n}\n[/block]\nQueries are saved Boolean searches that are run against all the documents you submit. If the search hits the document, we return the name of the query, the hit count, and the sentiment of the query. Queries are very precise and transparent. You can see exactly what you searched for and why it hit. Only what you search for will trigger a hit. This makes query-based classification a good fit when the bucket is easy to define. For instance,if you are looking for all documents where someone talks about their iPhone, you can easily write a query to find that.\n\nThe downside to using queries for categorization is that for buckets that represent more of a concept it can be difficult and time-consuming to construct a query that fits. You have to worry about synonyms, ambiguous terms, and so on. For instance, if you are looking for all documents about mobile technology, you need to search for the various types of mobile technologies, exclude those documents that talk about mobility and technology separately and so on.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Categories\"\n}\n[/block]\nCategories are saved searches constructed using our Concept Matrix language. They are run against all documents you submit. Each category is evaluated and given a percentage relevance score against the document. If the score is higher than your category threshold, we return the category name, the relevancy score, and the sentiment of the category. Category queries use our Concept Matrix to extend the terms you submit. For instance, a category of \"food\" will score well on a document such as \"I had some chicken wings the other day that were just awesome.\" This makes categories a good choice to match broader buckets, such as sports, food, art, or technology.\n\nThe downside to using categories for categorization is that it is not transparent. You will hit documents that do not have any of your terms in them, and it will be difficult to tell exactly why they hit. In addition, categories do not work well for very short content such as tweets, because categories need context to work with.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Auto categories\"\n}\n[/block]\nAuto categories are categories that we built for you that match the Wikipedia taxonomy. 
There are about 4,000 entries and the taxonomy is three levels deep. The taxonomy and the categories are not user-modifiable. If one of the categories hits your document, you will receive back the name of the category along with the sentiment, the ancestor nodes of the category, the score, and the URL to the Wikipedia page representing the category.\n\nAuto categories provide mainly broad subject areas such as Agriculture, Physics, Computers and so on, although some of the leaves are fairly granular. For a complete list, see <here>\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Machine Learning\"\n}\n[/block]\nMachine learning is a categorization technique where instead of the user writing queries the user assembles a set of documents each tagged with the appropriate bucket. Once you have a set of data, a machine learning process is run to create a statistical model from the documents. For instance, if you tagged lots of documents with the bucket \"Mobile Phones\" and most of them had the word iPhone in them, the machine will learn that iPhone is closely related to the bucket Mobile Phones. When it sees a new document with the word iPhone in it, it will give a high confidence score for the category of Mobile Phones to that document.\n\nLexalytics supports various machine learning models. They are not supported in Semantria at present.","excerpt":"","slug":"feature-3","type":"basic","title":"Categorization","__v":0,"childrenPages":[]}

Categorization


Categorization is the process of putting documents into buckets of some kind. For NLP purposes, the buckets are usually the subjects you are looking to analyze. Those categories might be very broad, like newspaper sections (sports, world, art, etc.), or very focused (the different substrates in silicon wafer manufacturing processes).

There are many ways to categorize documents, and different techniques are better suited to particular needs. Lexalytics provides a variety of ways to categorize your documents.

## Queries

Queries are saved Boolean searches that are run against all the documents you submit. If the search hits a document, we return the name of the query, the hit count, and the sentiment of the query. Queries are very precise and transparent: you can see exactly what you searched for and why it hit, and only what you search for will trigger a hit. This makes query-based classification a good fit when the bucket is easy to define. For instance, if you are looking for all documents where someone talks about their iPhone, you can easily write a query to find that.

The downside to using queries for categorization is that for buckets that represent more of a concept, it can be difficult and time-consuming to construct a query that fits. You have to worry about synonyms, ambiguous terms, and so on. For instance, if you are looking for all documents about mobile technology, you need to search for the various types of mobile technologies, exclude documents that talk about mobility and technology separately, and so on.
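Here is a hypothetical sketch of the idea behind query-based classification. The `matches` helper and its keyword arguments are our invention for illustration, not the Semantria query syntax:

```python
# Hypothetical illustration of a saved Boolean search run against a document.
def matches(doc: str, all_of=(), any_of=(), none_of=()) -> bool:
    text = doc.lower()
    return (all(term in text for term in all_of)
            and (not any_of or any(term in text for term in any_of))
            and not any(term in text for term in none_of))

# An "iPhone" bucket is easy to define: precise and transparent.
doc = "My iPhone battery finally died after two years."
print(matches(doc, all_of=("iphone",)))                               # True
print(matches(doc, any_of=("iphone", "ipad"), none_of=("android",)))  # True
```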
## Categories

Categories are saved searches constructed using our Concept Matrix language, and they are run against all documents you submit. Each category is evaluated and given a percentage relevance score against the document. If the score is higher than your category threshold, we return the category name, the relevance score, and the sentiment of the category. Category queries use our Concept Matrix to extend the terms you submit. For instance, a category of "food" will score well on a document such as "I had some chicken wings the other day that were just awesome." This makes categories a good choice for broader buckets, such as sports, food, art, or technology.

The downside to using categories is that they are not transparent. You will hit documents that do not contain any of your terms, and it will be difficult to tell exactly why they hit. In addition, categories do not work well for very short content such as tweets, because categories need context to work with.

## Auto categories

Auto categories are categories we built for you that match the Wikipedia taxonomy. There are about 4,000 entries, and the taxonomy is three levels deep. The taxonomy and the categories are not user-modifiable. If one of the categories hits your document, you will receive back the name of the category along with the sentiment, the ancestor nodes of the category, the score, and the URL to the Wikipedia page representing the category.

Auto categories provide mainly broad subject areas such as Agriculture, Physics, Computers, and so on, although some of the leaves are fairly granular. For a complete list, see <here>.

## Machine Learning

Machine learning is a categorization technique where, instead of writing queries, the user assembles a set of documents, each tagged with the appropriate bucket. Once you have a set of data, a machine learning process creates a statistical model from the documents. For instance, if you tagged lots of documents with the bucket "Mobile Phones" and most of them had the word iPhone in them, the machine will learn that iPhone is closely related to the bucket Mobile Phones. When it sees a new document with the word iPhone in it, it will give that document a high confidence score for the Mobile Phones category.

Lexalytics supports various machine learning models. They are not supported in Semantria at present.
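As a sketch of this machine-learning approach, using scikit-learn rather than Lexalytics' own model pipeline, with made-up training documents:

```python
# Minimal sketch: train a text classifier on tagged documents, then score
# new documents. Training data is invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

train_docs = [
    "my iPhone screen cracked",
    "upgraded to the new iPhone yesterday",
    "the chicken wings were awesome",
    "terrible service at the restaurant",
]
labels = ["Mobile Phones", "Mobile Phones", "Food", "Food"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_docs, labels)

# A new document mentioning "iPhone" gets a high confidence score for
# the Mobile Phones bucket.
print(model.predict(["I love my iPhone"]))        # ['Mobile Phones']
print(model.predict_proba(["I love my iPhone"]))  # per-bucket probabilities
```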
{"category":"577e4bf24159cd1900d5d2ae","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2d6","createdAt":"2015-08-13T17:17:38.226Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":13,"body":"\"Entity\" is a word for some kind of proper noun - a person, place, or thing. The \"thing\" could be an organization such as Lexalytics, a product, such as an iPhone, or any other type of thing you need to track. Recognizing that a word or phrase is a particular type of thing is called entity recognition. Entity recognition is different from using searches to find particular entities, because entity recognition finds them without you having to specify the particular ones you want. For instance, entity recognition allows you to ask questions like \"What companies I am not currently tracking are being mentioned along with mine?\"\n\nLexalytics provides two ways to recognize entities. We find them out of the box by ourselves, and you can also import your own dictionary of entities for us to find. Perhaps you have a list of SKUs you need to track, or you want to recognize hotel amenities (swimming pool, front desk and so on) as entities. You can import these, and you can use our query syntax (AND, OR, NOT, NEAR, WITH) to define them. \n\nWe also support normalization. This is something done to roll variations of entities together. For instance, companies are sometimes mentioned by their full legal name (Cisco Systems, Inc), sometimes by their bare name (Cisco) and sometimes by something else entirely, like their stock ticker symbol (CSCO). For reporting purposes, you want all of these to be treated as the same entity - its not really relevant if someone used the stock ticker vs the name. \n\nFinally, you can label an entity as a different type if you so choose. Instead of using a type of \"company\" for your competitors, perhaps you want to label them all as \"competitors\" instead so you can easily find all competitors in your report.\n\nEntity names are returned to you in the output, along with their type, their label, the normalized form, whether they were found by us or defined by you, their sentiment, themes, and the number of times they were mentioned. Sometimes multiple mentions of an entity are found and linked together. This happens a lot for people - the first mention might be President Barack Obama, but the second mention is just Obama, and the third is the pronoun \"he.\" We link those together and report the longest mention - President Barack Obama - as the name of the entity, along with three mentions. By default we don't report each individual mention back to you in Semantria, but that can be enabled if desired.","excerpt":"","slug":"entity-recognition","type":"basic","title":"Entity Recognition","__v":0,"childrenPages":[]}

Entity Recognition


"Entity" is a word for some kind of proper noun - a person, place, or thing. The "thing" could be an organization such as Lexalytics, a product, such as an iPhone, or any other type of thing you need to track. Recognizing that a word or phrase is a particular type of thing is called entity recognition. Entity recognition is different from using searches to find particular entities, because entity recognition finds them without you having to specify the particular ones you want. For instance, entity recognition allows you to ask questions like "What companies I am not currently tracking are being mentioned along with mine?" Lexalytics provides two ways to recognize entities. We find them out of the box by ourselves, and you can also import your own dictionary of entities for us to find. Perhaps you have a list of SKUs you need to track, or you want to recognize hotel amenities (swimming pool, front desk and so on) as entities. You can import these, and you can use our query syntax (AND, OR, NOT, NEAR, WITH) to define them. We also support normalization. This is something done to roll variations of entities together. For instance, companies are sometimes mentioned by their full legal name (Cisco Systems, Inc), sometimes by their bare name (Cisco) and sometimes by something else entirely, like their stock ticker symbol (CSCO). For reporting purposes, you want all of these to be treated as the same entity - its not really relevant if someone used the stock ticker vs the name. Finally, you can label an entity as a different type if you so choose. Instead of using a type of "company" for your competitors, perhaps you want to label them all as "competitors" instead so you can easily find all competitors in your report. Entity names are returned to you in the output, along with their type, their label, the normalized form, whether they were found by us or defined by you, their sentiment, themes, and the number of times they were mentioned. Sometimes multiple mentions of an entity are found and linked together. This happens a lot for people - the first mention might be President Barack Obama, but the second mention is just Obama, and the third is the pronoun "he." We link those together and report the longest mention - President Barack Obama - as the name of the entity, along with three mentions. By default we don't report each individual mention back to you in Semantria, but that can be enabled if desired.
"Entity" is a word for some kind of proper noun - a person, place, or thing. The "thing" could be an organization such as Lexalytics, a product, such as an iPhone, or any other type of thing you need to track. Recognizing that a word or phrase is a particular type of thing is called entity recognition. Entity recognition is different from using searches to find particular entities, because entity recognition finds them without you having to specify the particular ones you want. For instance, entity recognition allows you to ask questions like "What companies I am not currently tracking are being mentioned along with mine?" Lexalytics provides two ways to recognize entities. We find them out of the box by ourselves, and you can also import your own dictionary of entities for us to find. Perhaps you have a list of SKUs you need to track, or you want to recognize hotel amenities (swimming pool, front desk and so on) as entities. You can import these, and you can use our query syntax (AND, OR, NOT, NEAR, WITH) to define them. We also support normalization. This is something done to roll variations of entities together. For instance, companies are sometimes mentioned by their full legal name (Cisco Systems, Inc), sometimes by their bare name (Cisco) and sometimes by something else entirely, like their stock ticker symbol (CSCO). For reporting purposes, you want all of these to be treated as the same entity - its not really relevant if someone used the stock ticker vs the name. Finally, you can label an entity as a different type if you so choose. Instead of using a type of "company" for your competitors, perhaps you want to label them all as "competitors" instead so you can easily find all competitors in your report. Entity names are returned to you in the output, along with their type, their label, the normalized form, whether they were found by us or defined by you, their sentiment, themes, and the number of times they were mentioned. Sometimes multiple mentions of an entity are found and linked together. This happens a lot for people - the first mention might be President Barack Obama, but the second mention is just Obama, and the third is the pronoun "he." We link those together and report the longest mention - President Barack Obama - as the name of the entity, along with three mentions. By default we don't report each individual mention back to you in Semantria, but that can be enabled if desired.
{"category":"577e4bf24159cd1900d5d2ae","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2d7","createdAt":"2015-08-13T17:49:53.672Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":14,"body":"Extracting important words from a piece of text is known in NLP variously as theme, key word or key phrase extraction. At Lexalytics, we call it theme extraction. Themes are found by looking for phrases that match POS patterns - mostly descriptive noun phrases such as \"delicious seafood\" or \"seared scallops.\" We do not extract single words as single words are often very noisy.\n\nThemes do not extract entities, since entities are proper nouns. If you want to find companies, products, people and so on, see our entity recognition feature.\n\nWhen we find candidate themes in the text, we also look at their relevance to the entire document. If the theme doesn't seem to be relevant to the rest of the document, it is dropped. If it is retained, it receives a relevance score. This score is only useful for ranking themes within a document against each other, not across documents. In addition to the score and the theme itself, we also report back a stemmed and lower-cased version of the theme, the sentiment, and the theme summary.\n\nThemes are not easily tunable, unlike most of our other features, but you can use the blacklist feature in Semantria to suppress themes containing words you do not find useful or relevant. Additional tuning is possible via our professional services.","excerpt":"","slug":"themes","type":"basic","title":"Themes","__v":0,"childrenPages":[]}

Themes


Extracting important words from a piece of text is known in NLP variously as theme, keyword, or key phrase extraction. At Lexalytics, we call it theme extraction. Themes are found by looking for phrases that match POS patterns, mostly descriptive noun phrases such as "delicious seafood" or "seared scallops." We do not extract single words, as single words are often very noisy.

Themes do not extract entities, since entities are proper nouns. If you want to find companies, products, people, and so on, see our entity recognition feature.

When we find candidate themes in the text, we also look at their relevance to the entire document. If a theme doesn't seem relevant to the rest of the document, it is dropped. If it is retained, it receives a relevance score. This score is only useful for ranking themes within a document against each other, not across documents. In addition to the score and the theme itself, we also report back a stemmed and lower-cased version of the theme, the sentiment, and the theme summary.

Themes are not easily tunable, unlike most of our other features, but you can use the blacklist feature in Semantria to suppress themes containing words you do not find useful or relevant. Additional tuning is possible via our professional services.
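As an illustration of matching POS patterns for descriptive noun phrases, here is a sketch using NLTK's chunker rather than the Lexalytics theme extractor; the pattern is a simplified stand-in, not our actual pattern set:

```python
# Illustration only: match adjective/participle + noun phrases with NLTK.
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

tagged = nltk.pos_tag(nltk.word_tokenize(
    "We had delicious seafood and seared scallops."))

# A simplified stand-in for a theme POS pattern: one or more adjectives or
# past participles followed by one or more nouns.
grammar = "THEME: {<JJ|VBN>+<NN.*>+}"
tree = nltk.RegexpParser(grammar).parse(tagged)

for subtree in tree.subtrees(filter=lambda t: t.label() == "THEME"):
    print(" ".join(word for word, tag in subtree.leaves()))
# Expected: "delicious seafood" and "seared scallops"
```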
{"category":"577e4bf24159cd1900d5d2ae","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2d8","createdAt":"2015-08-13T20:54:22.961Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":15,"body":"Lexical chains are a technique used to create topic-specific summaries in text. At Lexalytics we use synonym expansion and other techniques to construct chains of topics in pieces of text longer than three sentences. These chains form the basis of our extraction specific summaries and sentiment - for instance entity level sentiment.\n\nThe chains themselves are not exposed to the end user.","excerpt":"","slug":"lexical-chaining","type":"basic","title":"Lexical Chaining","__v":0,"childrenPages":[]}

Lexical Chaining


Lexical chains are a technique for creating topic-specific summaries of text. At Lexalytics, we use synonym expansion and other techniques to construct chains of topics in pieces of text longer than three sentences. These chains form the basis of our extraction-specific summaries and sentiment, for instance entity-level sentiment.

The chains themselves are not exposed to the end user.
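A toy sketch of the synonym-expansion idea, using WordNet via NLTK; our own chaining is more sophisticated and, as noted, not exposed:

```python
# Toy illustration: two words can join the same chain if they share a
# WordNet synset or stand in a direct hypernym relation.
import nltk
nltk.download("wordnet", quiet=True)
from nltk.corpus import wordnet as wn

def same_chain(w1: str, w2: str) -> bool:
    s1, s2 = set(wn.synsets(w1)), set(wn.synsets(w2))
    if s1 & s2:                      # direct synonyms share a synset
        return True
    return any(a in b.hypernyms() or b in a.hypernyms()
               for a in s1 for b in s2)

print(same_chain("car", "automobile"))  # True: shared synset
print(same_chain("car", "banana"))      # False
```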
{"category":"577e4bf24159cd1900d5d2ae","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2d9","createdAt":"2015-08-13T20:54:48.252Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":16,"body":"Lexalytics provides a library to detect the language of a piece of text, primarily those languages we support directly. We also detect languages that might be easily confused with a supported one, such as Bulgarian versus Russian. Texts can be made of pieces from different languages, for instance intermingled Spanish and English is quite common in some social media. We do not try to identify the individual pieces, only judge what the entire text is most likely to be.\n\nIn our Semantria product, language detection is done at submission time and cannot be used as a way to route content in our system. The content will go to the configuration you specified regardless of the language detected. However, you can use the detected value to see if you are submitting content in a language different from the configuration language.","excerpt":"","slug":"language-detection","type":"basic","title":"Language Detection","__v":0,"childrenPages":[]}

Language Detection


Lexalytics provides a library to detect the language of a piece of text, primarily the languages we support directly. We also detect languages that might be easily confused with a supported one, such as Bulgarian versus Russian. Texts can be made of pieces from different languages; intermingled Spanish and English, for instance, is quite common in some social media. We do not try to identify the individual pieces, only judge what language the entire text is most likely to be.

In our Semantria product, language detection is done at submission time and cannot be used as a way to route content in our system. The content will go to the configuration you specified regardless of the language detected. However, you can use the detected value to see if you are submitting content in a language different from the configuration language.
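For illustration, the open-source langdetect package (not the Lexalytics detector) shows the same whole-text behavior:

```python
# Illustration with langdetect (pip install langdetect). Mixed-language
# text still gets a single best-guess judgment for the text as a whole.
from langdetect import detect, detect_langs

print(detect("This is clearly English text."))
# 'en'
print(detect_langs("Hoy fue un buen día, but the traffic was awful."))
# e.g. [es:0.57, en:0.43] -- one ranked judgment for the entire text
```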
{"category":"577e4bf24159cd1900d5d2ae","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2da","createdAt":"2015-08-14T14:31:31.398Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":17,"body":"Lexalytics provides a summary of the overall text as well as contextual summaries of extracted elements such as entities and themes. Our Semantria product returns only the summary of the document overall. The document summary contains the most important sentences in the document. Showing the summary of a document can be useful for longer pieces of text such as news articles.","excerpt":"","slug":"summarization-1","type":"basic","title":"Summarization","__v":0,"childrenPages":[]}

Summarization


Lexalytics provides a summary of the overall text as well as contextual summaries of extracted elements such as entities and themes. Our Semantria product returns only the summary of the document overall. The document summary contains the most important sentences in the document. Showing the summary of a document can be useful for longer pieces of text such as news articles.
{"category":"577e4bf24159cd1900d5d2ae","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2db","createdAt":"2015-08-14T14:53:03.193Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":18,"body":"Lexalytics extracts relationships between entities, such as occupation, or a company hiring someone. These are extracted based on textual patterns \nThese are pre-defined and can only be modified by working with our professional services team.","excerpt":"","slug":"relationships","type":"basic","title":"Relationships","__v":0,"childrenPages":[]}

Relationships


Lexalytics extracts relationships between entities, such as occupation, or a company hiring someone. These are extracted based on textual patterns. The patterns are pre-defined and can only be modified by working with our professional services team.
{"category":"577e4bf24159cd1900d5d2af","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":["577a3574aea88b0e00f6331f"],"_id":"577e4bf24159cd1900d5d2e7","createdAt":"2015-07-07T21:24:09.251Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":19,"body":"[block:callout]\n{\n  \"type\": \"info\",\n  \"title\": \"Signup for FREE\",\n  \"body\": \"Sign up for a free trial account, and your key and secret will be emailed to you. Once you get your credentials you can start using Semantria.\\n[Signup now](http://www.lexalytics.com/signup)\"\n}\n[/block]","excerpt":"","slug":"obtain-your-creds","type":"basic","title":"Semantria API documentation","__v":0,"childrenPages":[]}

Semantria API documentation


**Signup for FREE**

Sign up for a free trial account, and your key and secret will be emailed to you. Once you get your credentials you can start using Semantria.

[Signup now](http://www.lexalytics.com/signup)
{"category":"577e4bf24159cd1900d5d2af","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"557b06b0ebf0e22f00c45c96","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2e8","createdAt":"2015-11-05T15:36:25.916Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":20,"body":"Looking to integrate? Semantria's HTTP API wrappers are the most convenient way to access the Semantria API on your favorite framework. We support Java, .NET, PHP, Python, Ruby, and node.js. Our SDK libraries include Authentication, Session Manager, Serializer, Detailed analysis test app and Discovery analysis test app.\n\nIf you make an SDK for a language we don't have yet, send it to us! We will happily give you data-processing credit in return.\n\nSemantria SDKs are also on [GitHub](https://github.com/Semantria/semantria-sdk).\n[block:parameters]\n{\n  \"data\": {\n    \"0-0\": \"**Java**\",\n    \"0-1\": \"37 KB\",\n    \"0-2\": \"[Download](http://www.semantria.com/download/SDK/SemantriaJavaSDK.tar.gz)\",\n    \"1-0\": \"**.NET**\",\n    \"1-1\": \"93 KB\",\n    \"1-2\": \"[Download](http://www.semantria.com/download/SDK/SemantriaDotNetSDK.zip)\",\n    \"2-2\": \"[Download](http://www.semantria.com/download/SDK/SemantriaPHPSDK.tar.gz)\",\n    \"3-2\": \"[Download](http://www.semantria.com/download/SDK/SemantriaPythonSDK.tar.gz)\",\n    \"4-2\": \"[Download](http://www.semantria.com/download/SDK/SemantriaRubySDK.tar.gz)\",\n    \"5-2\": \"[Download](http://www.semantria.com/download/SDK/SemantriaJavaScriptSDK.tar.gz)\",\n    \"2-0\": \"**PHP**\",\n    \"2-1\": \"16 KB\",\n    \"3-0\": \"**Python**\",\n    \"3-1\": \"32 KB\",\n    \"4-0\": \"**Ruby**\",\n    \"4-1\": \"15 KB\",\n    \"5-0\": \"**JavaScript**\",\n    \"5-1\": \"18 KB\",\n    \"6-0\": \"**Node.js**\",\n    \"6-2\": \"[Download](http://www.semantria.com/download/SDK/SemantriaNodejsSDK.tar.gz)\",\n    \"6-1\": \"153 KB\"\n  },\n  \"cols\": 3,\n  \"rows\": 7\n}\n[/block]","excerpt":"","slug":"install-the-sdk","type":"basic","title":"Install the SDK","__v":0,"childrenPages":[]}

Install the SDK


Looking to integrate? Semantria's HTTP API wrappers are the most convenient way to access the Semantria API from your favorite framework. We support Java, .NET, PHP, Python, Ruby, JavaScript, and Node.js. Our SDK libraries include authentication, a session manager, a serializer, a Detailed analysis test app, and a Discovery analysis test app.

If you make an SDK for a language we don't have yet, send it to us! We will happily give you data-processing credit in return.

Semantria SDKs are also on [GitHub](https://github.com/Semantria/semantria-sdk).

| Language | Size | Link |
| --- | --- | --- |
| **Java** | 37 KB | [Download](http://www.semantria.com/download/SDK/SemantriaJavaSDK.tar.gz) |
| **.NET** | 93 KB | [Download](http://www.semantria.com/download/SDK/SemantriaDotNetSDK.zip) |
| **PHP** | 16 KB | [Download](http://www.semantria.com/download/SDK/SemantriaPHPSDK.tar.gz) |
| **Python** | 32 KB | [Download](http://www.semantria.com/download/SDK/SemantriaPythonSDK.tar.gz) |
| **Ruby** | 15 KB | [Download](http://www.semantria.com/download/SDK/SemantriaRubySDK.tar.gz) |
| **JavaScript** | 18 KB | [Download](http://www.semantria.com/download/SDK/SemantriaJavaScriptSDK.tar.gz) |
| **Node.js** | 153 KB | [Download](http://www.semantria.com/download/SDK/SemantriaNodejsSDK.tar.gz) |
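After downloading an SDK (or, for Python, installing with `pip install semantria_sdk` as in the Quick Start), a quick import check confirms the package is available on your path:

[block:code] { "codes": [ { "code": "# Verify the Python SDK is importable after installation\nimport semantria\nprint(\"semantria module loaded from:\", semantria.__file__)", "language": "python" } ] } [/block]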
{"category":"577e4bf24159cd1900d5d2af","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"557b06b0ebf0e22f00c45c96","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2e9","createdAt":"2015-11-05T15:36:48.817Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":21,"body":"Semantria uses a variant of the OAUTH specification, and the easiest way to get started is to use one of our SDKs which will handle the authentication details for you. \n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"import semantria\\nsession = semantria.Session( key, secret )\",\n      \"language\": \"python\"\n    },\n    {\n      \"code\": \"namespace Quickstart\\n{\\n    class Program\\n    {\\n        static void Main(string[] args)\\n        {\\n\\n            // Replace with your API key and secret\\n            string API_KEY = \\\"\\\";\\n            string API_SECRET = \\\"\\\";\\n          // Instantiate a Semantria session\\n            ISerializer serializer = new Semantria.Com.Serializers.JsonSerializer();\\n            Semantria.Com.Session mySession = Semantria.Com.Session.CreateSession(API_KEY, API_SECRET, serializer);\",\n      \"language\": \"csharp\"\n    }\n  ]\n}\n[/block]\n\n and now you are ready to analyze some content with that session.","excerpt":"","slug":"authenticate","type":"basic","title":"Authenticate","__v":0,"childrenPages":[]}

Authenticate


Semantria uses a variant of the OAuth specification, and the easiest way to get started is to use one of our SDKs, which handle the authentication details for you. [block:code] { "codes": [ { "code": "import semantria\nsession = semantria.Session(key, secret)", "language": "python" }, { "code": "namespace Quickstart\n{\n    class Program\n    {\n        static void Main(string[] args)\n        {\n            // Replace with your API key and secret\n            string API_KEY = \"\";\n            string API_SECRET = \"\";\n\n            // Instantiate a Semantria session\n            ISerializer serializer = new Semantria.Com.Serializers.JsonSerializer();\n            Semantria.Com.Session mySession = Semantria.Com.Session.CreateSession(API_KEY, API_SECRET, serializer);\n        }\n    }\n}", "language": "csharp" } ] } [/block] Now you are ready to analyze some content with that session.
{"category":"577e4bf24159cd1900d5d2af","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2ea","createdAt":"2015-09-15T18:09:38.366Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":22,"body":"Now that you have your key and secret, you can send documents to Semantria. Since Semantria is asynchronous, you will need to request them as well. Here is a simple snippet to submit three documents for analysis and retrieve them. A more worked out example that handles errors and prints results is shown in our Quick Start Guides.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"import semantria, time, uuid\\n\\n#some sample text\\ninitialTexts = [\\n    \\\"Lisa - there's 2 Skinny cow coupons available $5 skinny cow ice cream coupons on special k boxes and Printable FPC from facebook - a teeny tiny cup of ice cream. I printed off 2 (1 from my account and 1 from dh's). I couldn't find them instore and i'm not going to walmart before the 19th. Oh well sounds like i'm not missing much ...lol\\\",\\n    \\\"In Lake Louise - a guided walk for the family with Great Divide Nature Tours  rent a canoe on Lake Louise or Moraine Lake  go for a hike to the Lake Agnes Tea House. In between Lake Louise and Banff - visit Marble Canyon or Johnson Canyon or both for family friendly short walks. In Banff  a picnic at Johnson Lake  rent a boat at Lake Minnewanka  hike up Tunnel Mountain  walk to the Bow Falls and the Fairmont Banff Springs Hotel  visit the Banff Park Museum. The \\\\\\\"must-do\\\\\\\" in Banff is a visit to the Banff Gondola and some time spent on Banff Avenue - think candy shops and ice cream. Have a Fanta while you're there.\\\",\\n    \\\"On this day in 1786 - In New York City  commercial ice cream was manufactured for the first time.\\\"\\n]\\nfor text in initialTexts:\\n   doc = {\\\"id\\\": str(uuid.uuid4()).replace(\\\"-\\\", \\\"\\\"), \\\"text\\\": text}\\n\\nsession = semantria.Session(key, secret)\\n#queue a batch of documents\\nsession.queueBatch(initialTexts)\\n#keep track of how many we sent\\nnumber_of_docs = len(initialTexts)\\nresults = []\\n#keep polling for results until we got everything back\\nwhile len(results) < number_of_docs:\\n  time.sleep(3)\\n  status = session.getProcessedDocuments()\\n  if isinstance(status, list):\\n        for object_ in status:\\n            results.append(object_)\\n          \",\n      \"language\": \"python\"\n    },\n    {\n      \"code\": \"using System;\\nusing System.Collections.Generic;\\nusing System.Linq;\\n\\nusing Semantria.Com;\\n\\nnamespace Quickstart\\n{\\n    class Program\\n    {\\n        static void Main(string[] args)\\n        {\\n\\n            // Replace with your API key and secret\\n            string API_KEY = \\\"\\\";\\n            string API_SECRET = \\\"\\\";\\n            \\n            // Some sample text\\n            List<String> myText = new List<string>();\\n            myText.Add(\\\"Lisa - there's 2 Skinny cow coupons available $5 skinny cow ice cream coupons on special k boxes and Printable FPC from facebook - a teeny tiny cup of ice cream. I printed off 2 (1 from my account and 1 from dh's). 
I couldn't find them instore and i'm not going to walmart before the 19th. Oh well sounds like i'm not missing much ...lol\\\");\\n            myText.Add(\\\"In Lake Louise - a guided walk for the family with Great Divide Nature Tours  rent a canoe on Lake Louise or Moraine Lake  go for a hike to the Lake Agnes Tea House. In between Lake Louise and Banff - visit Marble Canyon or Johnson Canyon or both for family friendly short walks. In Banff  a picnic at Johnson Lake  rent a boat at Lake Minnewanka  hike up Tunnel Mountain  walk to the Bow Falls and the Fairmont Banff Springs Hotel  visit the Banff Park Museum. The \\\\\\\"must-do\\\\\\\" in Banff is a visit to the Banff Gondola and some time spent on Banff Avenue - think candy shops and ice cream. Have a Fanta while you're there.\\\");\\n            myText.Add(\\\"On this day in 1786 - In New York City  commercial ice cream was manufactured for the first time.\\\");\\n\\n            // Instantiate a Semantria session\\n            ISerializer serializer = new Semantria.Com.Serializers.JsonSerializer();\\n            Semantria.Com.Session mySession = Semantria.Com.Session.CreateSession(API_KEY, API_SECRET, serializer);\\n\\n            // Generate Semantria Document list to send for processing\\n            List<Semantria.Com.Mapping.Document> myOutgoingDocuments = new List<Semantria.Com.Mapping.Document>(myText.Count);\\n            foreach (string aText in myText)\\n            {\\n                string DocId = Guid.NewGuid().ToString();\\n                Semantria.Com.Mapping.Document aDocument = new Semantria.Com.Mapping.Document(){Id = DocId, Text = aText};\\n                myOutgoingDocuments.Add(aDocument);\\n            }\\n\\n            // Queue a batch of documents\\n            mySession.QueueBatchOfDocuments(myOutgoingDocuments);\\n            \\n            // Prepare a list for results\\n            List<Semantria.Com.Mapping.Output.DocAnalyticData> myResults = new List<Semantria.Com.Mapping.Output.DocAnalyticData>();\\n            foreach (Semantria.Com.Mapping.Document aDocument in myOutgoingDocuments)\\n            {\\n                Semantria.Com.Mapping.Output.DocAnalyticData aResult = new Semantria.Com.Mapping.Output.DocAnalyticData();\\n                aResult.Id = aDocument.Id;\\n                aResult.Status = Semantria.Com.TaskStatus.QUEUED;\\n                myResults.Add(aResult);\\n            }\\n\\n            // Poll for results until we've got results for everything we sent\\n            while (myResults.Any(item => item.Status == TaskStatus.QUEUED))\\n            {\\n                // Wait 3 seconds in between each poll for results\\n                System.Threading.Thread.Sleep(3000);\\n\\n                // Check for results\\n                IList<Semantria.Com.Mapping.Output.DocAnalyticData> myIncomingResults = mySession.GetProcessedDocuments();\\n                foreach (Semantria.Com.Mapping.Output.DocAnalyticData aIncomingResult in myIncomingResults)\\n                {\\n                    for (int i = 0; i < myResults.Count; i++)\\n                    {\\n                        if (myResults[i].Id == aIncomingResult.Id)\\n                        {\\n                            myResults[i] = aIncomingResult;\\n                            break;\\n                        }\\n                    }\\n                }\\n            }\\n        }\\n    }\\n}\\n\",\n      \"language\": \"csharp\"\n    }\n  ]\n}\n[/block]","excerpt":"","slug":"send-some-documents","type":"basic","title":"Process 
some documents","__v":0,"childrenPages":[]}

Process some documents


Now that you have your key and secret, you can send documents to Semantria. Because Semantria is asynchronous, you queue documents in one call and request the processed results in another. Here is a simple snippet that submits three documents for analysis and retrieves the results. A more fully worked example that handles errors and prints results is shown in our Quick Start Guides. [block:code] { "codes": [ { "code": "import semantria, time, uuid\n\n# some sample text\ninitialTexts = [\n    \"Lisa - there's 2 Skinny cow coupons available $5 skinny cow ice cream coupons on special k boxes and Printable FPC from facebook - a teeny tiny cup of ice cream. I printed off 2 (1 from my account and 1 from dh's). I couldn't find them instore and i'm not going to walmart before the 19th. Oh well sounds like i'm not missing much ...lol\",\n    \"In Lake Louise - a guided walk for the family with Great Divide Nature Tours rent a canoe on Lake Louise or Moraine Lake go for a hike to the Lake Agnes Tea House. In between Lake Louise and Banff - visit Marble Canyon or Johnson Canyon or both for family friendly short walks. In Banff a picnic at Johnson Lake rent a boat at Lake Minnewanka hike up Tunnel Mountain walk to the Bow Falls and the Fairmont Banff Springs Hotel visit the Banff Park Museum. The \\\"must-do\\\" in Banff is a visit to the Banff Gondola and some time spent on Banff Avenue - think candy shops and ice cream. Have a Fanta while you're there.\",\n    \"On this day in 1786 - In New York City commercial ice cream was manufactured for the first time.\"\n]\n\n# wrap each text in a document with a unique id\ndocs = []\nfor text in initialTexts:\n    docs.append({\"id\": str(uuid.uuid4()).replace(\"-\", \"\"), \"text\": text})\n\n# key and secret are your Semantria credentials\nsession = semantria.Session(key, secret)\n# queue the batch of documents\nsession.queueBatch(docs)\n# keep track of how many we sent\nnumber_of_docs = len(docs)\nresults = []\n# keep polling for results until we get everything back\nwhile len(results) < number_of_docs:\n    time.sleep(3)\n    status = session.getProcessedDocuments()\n    if isinstance(status, list):\n        for object_ in status:\n            results.append(object_)", "language": "python" }, { "code": "using System;\nusing System.Collections.Generic;\nusing System.Linq;\n\nusing Semantria.Com;\n\nnamespace Quickstart\n{\n    class Program\n    {\n        static void Main(string[] args)\n        {\n            // Replace with your API key and secret\n            string API_KEY = \"\";\n            string API_SECRET = \"\";\n\n            // Some sample text\n            List<String> myText = new List<string>();\n            myText.Add(\"Lisa - there's 2 Skinny cow coupons available $5 skinny cow ice cream coupons on special k boxes and Printable FPC from facebook - a teeny tiny cup of ice cream. I printed off 2 (1 from my account and 1 from dh's). I couldn't find them instore and i'm not going to walmart before the 19th. Oh well sounds like i'm not missing much ...lol\");\n            myText.Add(\"In Lake Louise - a guided walk for the family with Great Divide Nature Tours rent a canoe on Lake Louise or Moraine Lake go for a hike to the Lake Agnes Tea House. In between Lake Louise and Banff - visit Marble Canyon or Johnson Canyon or both for family friendly short walks. In Banff a picnic at Johnson Lake rent a boat at Lake Minnewanka hike up Tunnel Mountain walk to the Bow Falls and the Fairmont Banff Springs Hotel visit the Banff Park Museum. The \\\"must-do\\\" in Banff is a visit to the Banff Gondola and some time spent on Banff Avenue - think candy shops and ice cream. Have a Fanta while you're there.\");\n            myText.Add(\"On this day in 1786 - In New York City commercial ice cream was manufactured for the first time.\");\n\n            // Instantiate a Semantria session\n            ISerializer serializer = new Semantria.Com.Serializers.JsonSerializer();\n            Semantria.Com.Session mySession = Semantria.Com.Session.CreateSession(API_KEY, API_SECRET, serializer);\n\n            // Generate Semantria Document list to send for processing\n            List<Semantria.Com.Mapping.Document> myOutgoingDocuments = new List<Semantria.Com.Mapping.Document>(myText.Count);\n            foreach (string aText in myText)\n            {\n                string DocId = Guid.NewGuid().ToString();\n                Semantria.Com.Mapping.Document aDocument = new Semantria.Com.Mapping.Document(){ Id = DocId, Text = aText };\n                myOutgoingDocuments.Add(aDocument);\n            }\n\n            // Queue a batch of documents\n            mySession.QueueBatchOfDocuments(myOutgoingDocuments);\n\n            // Prepare a list for results\n            List<Semantria.Com.Mapping.Output.DocAnalyticData> myResults = new List<Semantria.Com.Mapping.Output.DocAnalyticData>();\n            foreach (Semantria.Com.Mapping.Document aDocument in myOutgoingDocuments)\n            {\n                Semantria.Com.Mapping.Output.DocAnalyticData aResult = new Semantria.Com.Mapping.Output.DocAnalyticData();\n                aResult.Id = aDocument.Id;\n                aResult.Status = Semantria.Com.TaskStatus.QUEUED;\n                myResults.Add(aResult);\n            }\n\n            // Poll for results until we've got results for everything we sent\n            while (myResults.Any(item => item.Status == TaskStatus.QUEUED))\n            {\n                // Wait 3 seconds in between each poll for results\n                System.Threading.Thread.Sleep(3000);\n\n                // Check for results\n                IList<Semantria.Com.Mapping.Output.DocAnalyticData> myIncomingResults = mySession.GetProcessedDocuments();\n                foreach (Semantria.Com.Mapping.Output.DocAnalyticData aIncomingResult in myIncomingResults)\n                {\n                    for (int i = 0; i < myResults.Count; i++)\n                    {\n                        if (myResults[i].Id == aIncomingResult.Id)\n                        {\n                            myResults[i] = aIncomingResult;\n                            break;\n                        }\n                    }\n                }\n            }\n        }\n    }\n}\n", "language": "csharp" } ] } [/block]
{"category":"577e4bf24159cd1900d5d2af","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2eb","createdAt":"2015-09-08T17:02:24.986Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":23,"body":"[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Detailed mode test app\"\n}\n[/block]\n**Step 1:** Install the SDK. If you have pip, just type `pip install semantria_sdk` in the command line (if you don't have it, [download pip](https://pypi.python.org/pypi/pip)).\n[block:callout]\n{\n  \"type\": \"info\",\n  \"body\": \"There are two kinds of analysis in the Semantria API: Detailed analysis and Discovery analysis. Detailed analysis processes individual documents (whether you send one or many documents) and returns the sentiment score, themes, and entities of each document. Discovery analysis processes documents as one collection and returns the facets and attributes that appear across many documents.\"\n}\n[/block]\n**Step 2:** Create a file called `detailed_test_app.py` and add the following lines:\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"from __future__ import print_function\\nimport semantria\\nimport uuid\\nimport time\\n\\nserializer = semantria.JsonSerializer()\",\n      \"language\": \"python\",\n      \"name\": \"detailed_test_app.py\"\n    }\n  ]\n}\n[/block]\n**Step 3:** Next, let's start the Semantria session. You'll need to input your API Key and Secret (to get a Key and Secret, sign up for a free trial here).You can create variables `semantria_key` and `semantria_secret` or input them as strings. In this example, we did not use variables.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"session = semantria.Session(\\\"key00000-0000-0000-0000-000000000000\\\", \\\"secret0-0000-0000-0000-000000000000\\\", serializer, use_compression=True)\",\n      \"language\": \"python\"\n    }\n  ]\n}\n[/block]\n**Step 4:** Now we'll write some text to analyze. We used these three examples, but feel free to make up your own.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"initialTexts = [\\n    \\\"Lisa - there's 2 Skinny cow coupons available $5 skinny cow ice cream coupons on special k boxes and Printable FPC from facebook - a teeny tiny cup of ice cream. I printed off 2 (1 from my account and 1 from dh's). I couldn't find them instore and i'm not going to walmart before the 19th. Oh well sounds like i'm not missing much ...lol\\\",\\n    \\\"In Lake Louise - a guided walk for the family with Great Divide Nature Tours rent a canoe on Lake Louise or Moraine Lake  go for a hike to the Lake Agnes Tea House. In between Lake Louise and Banff - visit Marble Canyon or Johnson Canyon or both for family friendly short walks. In Banff  a picnic at Johnson Lake rent a boat at Lake Minnewanka  hike up Tunnel Mountain  walk to the Bow Falls and the Fairmont Banff Springs Hotel  visit the Banff Park Museum. 
The \\\\\\\"must-do\\\\\\\" in Banff is a visit to the Banff Gondola and some time spent on Banff Avenue - think candy shops and ice cream.\\\",\\n    \\\"On this day in 1786 - In New York City  commercial ice cream was manufactured for the first time.\\\"\\n]\",\n      \"language\": \"python\"\n    }\n  ]\n}\n[/block]\n**Step 5:** Next, we'll create unique document IDs for each text sample we just wrote.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"for text in initialTexts:\\n   doc = {\\\"id\\\": str(uuid.uuid4()).replace(\\\"-\\\", \\\"\\\"), \\\"text\\\": text}\",\n      \"language\": \"python\"\n    }\n  ]\n}\n[/block]\n**Step 6**: Now we will queue our documents. Since this is an HTTP API, every request will return an HTTP status. Semantria has a whole list of custom HTTP statuses but for now, we are just checking if our documents have been queued properly (which will be a 202 status).\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"   status = session.queueDocument(doc)\\n   if status == 202:\\n      print(\\\"\\\\\\\"\\\", doc[\\\"id\\\"], \\\"\\\\\\\" document queued successfully.\\\", \\\"\\\\r\\\\n\\\")\",\n      \"language\": \"python\"\n    }\n  ]\n}\n[/block]\n**Step 7:** Semantria requires one call for queueing and another for requesting the processed documents. In this toy example, we will loop through until all our documents have been processed. We check every two seconds if there are new documents and add them to our results until all documents are processed.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"length = len(initialTexts)\\nresults = []\\n\\nwhile len(results) < length:\\n   print(\\\"Retrieving your processed results...\\\", \\\"\\\\r\\\\n\\\")\\n   time.sleep(2)\\n   # get processed documents\\n   status = session.getProcessedDocuments()\\n   results.extend(status)\",\n      \"language\": \"python\"\n    }\n  ]\n}\n[/block]\n\n[block:callout]\n{\n  \"type\": \"danger\",\n  \"body\": \"In a real application, you will want to have one job for queueing and another for requesting. `time.sleep()` will stop the entire program, so it will stop queueing and requesting; don't use it in your real application!\"\n}\n[/block]\n**Step 8:** Now we will print our processed results. It will print the document sentiment score, themes, entities, and their accompanying sentiment scores for each document.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"for data in results:\\n   # print document sentiment score\\n   print(\\\"Document \\\", data[\\\"id\\\"], \\\" Sentiment score: \\\", data[\\\"sentiment_score\\\"], \\\"\\\\r\\\\n\\\")\\n\\n   # print document themes\\n   if \\\"themes\\\" in data:\\n      print(\\\"Document themes:\\\", \\\"\\\\r\\\\n\\\")\\n      for theme in data[\\\"themes\\\"]:\\n         print(\\\"     \\\", theme[\\\"title\\\"], \\\" (sentiment: \\\", theme[\\\"sentiment_score\\\"], \\\")\\\", \\\"\\\\r\\\\n\\\")\\n\\n   # print document entities\\n   if \\\"entities\\\" in data:\\n      print(\\\"Entities:\\\", \\\"\\\\r\\\\n\\\")\\n      for entity in data[\\\"entities\\\"]:\\n         print(\\\"\\\\t\\\", entity[\\\"title\\\"], \\\" : \\\", entity[\\\"entity_type\\\"],\\\" (sentiment: \\\", entity[\\\"sentiment_score\\\"], \\\")\\\", \\\"\\\\r\\\\n\\\")\",\n      \"language\": \"python\"\n    }\n  ]\n}\n[/block]\nThat's it! 
Here is the expected output for `detailed_test_app.py`:\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"\\\" 64f5abc2fe604890a6e730ad0b8e3ff6 \\\" document queued successfully.\\n\\\" 3c5231606f874770b340393eb124b8cd \\\" document queued successfully.\\n\\\" 94d304bf9eb44f28975f5113d394efea \\\" document queued successfully.\\n\\nRetrieving your processed results...\\n\\nDocument  64f5abc2fe604890a6e730ad0b8e3ff6  Sentiment score:  -0.36575\\nDocument themes:\\n      skinny cow ice cream coupons  (sentiment:  -0.3 )\\n      Skinny cow coupons  (sentiment:  -0.3 )\\n      tiny cup  (sentiment:  -0.36071876 )\\n\\nDocument  3c5231606f874770b340393eb124b8cd  Sentiment score:  0.54\\nDocument themes:\\n      guided walk  (sentiment:  0.57000005 )\\n      short walks  (sentiment:  0.52500004 )\\n      Banff Gondola  (sentiment:  0.26250002 )\\n      candy shops  (sentiment:  0.13125001 )\\n      ice cream  (sentiment:  0.065625004 )\\nEntities:\\n\\t Lake Louise  :  Place  (sentiment:  1.2 )\\n\\t Banff Avenue  :  Place  (sentiment:  0.4 )\\n\\t Moraine Lake  :  Place  (sentiment:  0.6 )\\n\\t Agnes Tea House  :  Place  (sentiment:  0.6 )\\n\\t Fairmont Banff Springs Hotel  :  Company  (sentiment:  0.0 )\\n\\nDocument  94d304bf9eb44f28975f5113d394efea  Sentiment score:  0.18291481\\nDocument themes:\\n      commercial ice cream  (sentiment:  0.18291481 )\\nEntities:\\n\\t New York City  :  Place  (sentiment:  0.152429 )\",\n      \"language\": \"shell\",\n      \"name\": \"Output\"\n    }\n  ]\n}\n[/block]\nAs we can see, the first document had a negative sentiment of `-0.36575`; the second document had a positive sentiment of `0.54` while the third document had a neutral sentiment of `0.1829`.\n\nCool! You've just completed your first Semantria analysis. For a line-by-line explanation of all Detailed Mode analysis terms, check out our Detailed Mode Quick Reference.\n\nDetailed Mode is very useful for one-by-one analysis, but sometimes we want overall trends across many documents. That's where Discovery Mode comes in.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Discovery mode test app\"\n}\n[/block]\n\n[block:callout]\n{\n  \"type\": \"info\",\n  \"body\": \"In a Discovery analysis, we can process multiple documents in one analysis and see comprehensive trends.\"\n}\n[/block]\n**Step 1:** First, we'll need some source text. Here we have a collection of tweets we want to analyze in a file called source.txt. 
[Download the file](https://semantria.com/files/source.txt) and put it in the same directory as `detailed_test_app.py`.\n\n**Step 2:** To make a Discovery analysis test app, open a file called `discovery_test_app.py` (in the same directory as `source.txt`) and add the following lines:\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"from __future__ import print_function, unicode_literals\\nimport sys\\nimport time\\nimport uuid\\nimport semantria\",\n      \"language\": \"python\",\n      \"name\": \"discovery_test_app.py\"\n    }\n  ]\n}\n[/block]\nStep 3: Now we'll write some code that opens and reads the source text and adds it to our variable `docs`.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"if __name__ == \\\"__main__\\\":\\n   print(\\\"Semantria Collection processing mode demo.\\\")\\n\\n   docs = []\\n   print(\\\"Reading collection from file...\\\")\\n   for line in open('source.txt'):\\n      docs.append(line)\\n\\n   if len(docs) < 1:\\n      print(\\\"Source file isn't available or blank.\\\")\\n      sys.exit(1)\",\n      \"language\": \"python\"\n    }\n  ]\n}\n[/block]\n**Step 4:** Next we'll add our API key and secret (to get a Key and Secret, [sign up for a free trial](https://semantria.com/signup)).\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"      session = semantria.Session(\\\"key00000-0000-0000-0000-000000000000\\\", \\\"secret0-0000-0000-0000-000000000000\\\", use_compression=True)\",\n      \"language\": \"python\"\n    }\n  ]\n}\n[/block]\n**Step 5:** Next we generate a unique collection ID and queue the collection to be processed.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"   collection_id = str(uuid.uuid4())\\n   status = session.queueCollection({\\\"id\\\": collection_id, \\\"documents\\\": docs})\",\n      \"language\": \"python\"\n    }\n  ]\n}\n[/block]\n**Step 6:** In the Semantria API a 200 status means the collection has been queued and processed correctly and the server will likely respond with your processed data. A 202 status means the collection has been queued correctly (and the documents may still be processing). These are the only two HTTP statuses we want when queueing collections. A complete list of HTTP status messages can be found here.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"   if status != 200 and status != 202:\\n      print(\\\"Error:\\\")\\n      print(status)\\n      sys.exit(1)\\n\\n   print(\\\"%s collection queued successfully.\\\" % collection_id)\",\n      \"language\": \"python\"\n    }\n  ]\n}\n[/block]\n**Step 7:** Now we will retrieve our queued collection.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"   result = None\\n   while True:\\n      time.sleep(1)\\n      print(\\\"Retrieving your processed results...\\\")\\n      result = session.getCollection(collection_id)\\n      if result['status'] != 'QUEUED':\\n         break\\n   if result['status'] != 'PROCESSED':\\n      print(\\\"Error:\\\")\\n      print(results['status'])\\n      sys.exit(1)\",\n      \"language\": \"python\"\n    }\n  ]\n}\n[/block]\n**Step 8:** The final step is printing our processed collection.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"   print(\\\"\\\")\\n   for facet in result['facets']:\\n       print(\\\"%s : %s\\\" % (facet['label'], facet['count']))\",\n      \"language\": \"python\"\n    }\n  ]\n}\n[/block]\nCongratulations! You've completed your first Semantria Discovery analysis! 
Here is the expected output for `discovery_test_app.py`:\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"Semantria Collection processing mode demo.\\nReading collection from file...\\nd72b5994-a914-49be-b5a3-7c9e3cfc3e87 collection queued successfully.\\nRetrieving your processed results...\\nRetrieving your processed results...\\n\\nMayor Rob Ford : 2\\nreferences : 2\\nreferrals : 2\\ncourse : 2\\nmen : 2\\nblunder : 2\\nupdate : 1\\nTuesday : 1\\nPlace : 1\\nremains : 1\\nreply : 1\\nrally : 1\\nactor : 1\\ncrash : 1\\nOscar : 1\",\n      \"language\": \"shell\",\n      \"name\": \"Output\"\n    }\n  ]\n}\n[/block]","excerpt":"This is a slightly more worked out example of the code you saw earlier that demonstrates using IDs with documents as well as checking for status codes on submission.\n\nThis quickstart will guide you through your first Semantria analysis. We will create a Detailed mode test app and a Discovery mode test app. We'll use Python to create this example, but many other languages could be used. We have SDKs for C, Java, .NET, PHP, Python, Ruby, and JavaScript.","slug":"quick-start-with-python","type":"basic","title":"Quick Start with Python","__v":0,"childrenPages":[]}

Quick Start with Python

This is a more fully worked-out version of the code you saw earlier, demonstrating how to assign IDs to documents and check status codes on submission. This quickstart will guide you through your first Semantria analysis. We will create a Detailed mode test app and a Discovery mode test app. We'll use Python for this example, but many other languages could be used. We have SDKs for C, Java, .NET, PHP, Python, Ruby, and JavaScript.

[block:api-header] { "type": "basic", "title": "Detailed mode test app" } [/block] **Step 1:** Install the SDK. If you have pip, just type `pip install semantria_sdk` in the command line (if you don't have it, [download pip](https://pypi.python.org/pypi/pip)). [block:callout] { "type": "info", "body": "There are two kinds of analysis in the Semantria API: Detailed analysis and Discovery analysis. Detailed analysis processes individual documents (whether you send one or many documents) and returns the sentiment score, themes, and entities of each document. Discovery analysis processes documents as one collection and returns the facets and attributes that appear across many documents." } [/block] **Step 2:** Create a file called `detailed_test_app.py` and add the following lines: [block:code] { "codes": [ { "code": "from __future__ import print_function\nimport semantria\nimport uuid\nimport time\n\nserializer = semantria.JsonSerializer()", "language": "python", "name": "detailed_test_app.py" } ] } [/block] **Step 3:** Next, let's start the Semantria session. You'll need to input your API Key and Secret (to get a Key and Secret, sign up for a free trial here). You can create variables `semantria_key` and `semantria_secret` or input them as strings. In this example, we did not use variables. [block:code] { "codes": [ { "code": "session = semantria.Session(\"key00000-0000-0000-0000-000000000000\", \"secret0-0000-0000-0000-000000000000\", serializer, use_compression=True)", "language": "python" } ] } [/block] **Step 4:** Now we'll write some text to analyze. We used these three examples, but feel free to make up your own. [block:code] { "codes": [ { "code": "initialTexts = [\n    \"Lisa - there's 2 Skinny cow coupons available $5 skinny cow ice cream coupons on special k boxes and Printable FPC from facebook - a teeny tiny cup of ice cream. I printed off 2 (1 from my account and 1 from dh's). I couldn't find them instore and i'm not going to walmart before the 19th. Oh well sounds like i'm not missing much ...lol\",\n    \"In Lake Louise - a guided walk for the family with Great Divide Nature Tours rent a canoe on Lake Louise or Moraine Lake go for a hike to the Lake Agnes Tea House. In between Lake Louise and Banff - visit Marble Canyon or Johnson Canyon or both for family friendly short walks. In Banff a picnic at Johnson Lake rent a boat at Lake Minnewanka hike up Tunnel Mountain walk to the Bow Falls and the Fairmont Banff Springs Hotel visit the Banff Park Museum. The \\\"must-do\\\" in Banff is a visit to the Banff Gondola and some time spent on Banff Avenue - think candy shops and ice cream.\",\n    \"On this day in 1786 - In New York City commercial ice cream was manufactured for the first time.\"\n]", "language": "python" } ] } [/block] **Step 5:** Next, we'll create unique document IDs for each text sample we just wrote. [block:code] { "codes": [ { "code": "for text in initialTexts:\n    doc = {\"id\": str(uuid.uuid4()).replace(\"-\", \"\"), \"text\": text}", "language": "python" } ] } [/block] **Step 6:** Now we will queue our documents. Since this is an HTTP API, every request will return an HTTP status. Semantria has a whole list of custom HTTP statuses, but for now we are just checking that our documents have been queued properly (which will be a 202 status).
[block:code] { "codes": [ { "code": "    status = session.queueDocument(doc)\n    if status == 202:\n        print(\"\\\"\", doc[\"id\"], \"\\\" document queued successfully.\", \"\\r\\n\")", "language": "python" } ] } [/block] **Step 7:** Semantria requires one call for queueing and another for requesting the processed documents. In this toy example, we will loop until all our documents have been processed. We check every two seconds whether there are new documents and add them to our results until all documents are processed. [block:code] { "codes": [ { "code": "length = len(initialTexts)\nresults = []\n\nwhile len(results) < length:\n    print(\"Retrieving your processed results...\", \"\\r\\n\")\n    time.sleep(2)\n    # get processed documents\n    status = session.getProcessedDocuments()\n    results.extend(status)", "language": "python" } ] } [/block] [block:callout] { "type": "danger", "body": "In a real application, you will want to have one job for queueing and another for requesting. `time.sleep()` will stop the entire program, so it will stop queueing and requesting; don't use it in your real application!" } [/block] **Step 8:** Now we will print our processed results: the document sentiment score, themes, entities, and their accompanying sentiment scores for each document. [block:code] { "codes": [ { "code": "for data in results:\n    # print document sentiment score\n    print(\"Document \", data[\"id\"], \" Sentiment score: \", data[\"sentiment_score\"], \"\\r\\n\")\n\n    # print document themes\n    if \"themes\" in data:\n        print(\"Document themes:\", \"\\r\\n\")\n        for theme in data[\"themes\"]:\n            print(\"     \", theme[\"title\"], \" (sentiment: \", theme[\"sentiment_score\"], \")\", \"\\r\\n\")\n\n    # print document entities\n    if \"entities\" in data:\n        print(\"Entities:\", \"\\r\\n\")\n        for entity in data[\"entities\"]:\n            print(\"\\t\", entity[\"title\"], \" : \", entity[\"entity_type\"], \" (sentiment: \", entity[\"sentiment_score\"], \")\", \"\\r\\n\")", "language": "python" } ] } [/block] That's it!
Here is the expected output for `detailed_test_app.py`: [block:code] { "codes": [ { "code": "\" 64f5abc2fe604890a6e730ad0b8e3ff6 \" document queued successfully.\n\" 3c5231606f874770b340393eb124b8cd \" document queued successfully.\n\" 94d304bf9eb44f28975f5113d394efea \" document queued successfully.\n\nRetrieving your processed results...\n\nDocument 64f5abc2fe604890a6e730ad0b8e3ff6 Sentiment score: -0.36575\nDocument themes:\n skinny cow ice cream coupons (sentiment: -0.3 )\n Skinny cow coupons (sentiment: -0.3 )\n tiny cup (sentiment: -0.36071876 )\n\nDocument 3c5231606f874770b340393eb124b8cd Sentiment score: 0.54\nDocument themes:\n guided walk (sentiment: 0.57000005 )\n short walks (sentiment: 0.52500004 )\n Banff Gondola (sentiment: 0.26250002 )\n candy shops (sentiment: 0.13125001 )\n ice cream (sentiment: 0.065625004 )\nEntities:\n\t Lake Louise : Place (sentiment: 1.2 )\n\t Banff Avenue : Place (sentiment: 0.4 )\n\t Moraine Lake : Place (sentiment: 0.6 )\n\t Agnes Tea House : Place (sentiment: 0.6 )\n\t Fairmont Banff Springs Hotel : Company (sentiment: 0.0 )\n\nDocument 94d304bf9eb44f28975f5113d394efea Sentiment score: 0.18291481\nDocument themes:\n commercial ice cream (sentiment: 0.18291481 )\nEntities:\n\t New York City : Place (sentiment: 0.152429 )", "language": "shell", "name": "Output" } ] } [/block] As we can see, the first document had a negative sentiment of `-0.36575`; the second document had a positive sentiment of `0.54`, while the third document had a neutral sentiment of `0.1829`. Cool! You've just completed your first Semantria analysis. For a line-by-line explanation of all Detailed Mode analysis terms, check out our Detailed Mode Quick Reference. Detailed Mode is very useful for one-by-one analysis, but sometimes we want overall trends across many documents. That's where Discovery Mode comes in. [block:api-header] { "type": "basic", "title": "Discovery mode test app" } [/block] [block:callout] { "type": "info", "body": "In a Discovery analysis, we can process multiple documents in one analysis and see comprehensive trends." } [/block] **Step 1:** First, we'll need some source text. Here we have a collection of tweets we want to analyze in a file called `source.txt`. [Download the file](https://semantria.com/files/source.txt) and put it in the same directory as `detailed_test_app.py`. **Step 2:** To make a Discovery analysis test app, open a file called `discovery_test_app.py` (in the same directory as `source.txt`) and add the following lines: [block:code] { "codes": [ { "code": "from __future__ import print_function, unicode_literals\nimport sys\nimport time\nimport uuid\nimport semantria", "language": "python", "name": "discovery_test_app.py" } ] } [/block] **Step 3:** Now we'll write some code that opens and reads the source text and adds it to our variable `docs`. [block:code] { "codes": [ { "code": "if __name__ == \"__main__\":\n    print(\"Semantria Collection processing mode demo.\")\n\n    docs = []\n    print(\"Reading collection from file...\")\n    for line in open('source.txt'):\n        docs.append(line)\n\n    if len(docs) < 1:\n        print(\"Source file isn't available or blank.\")\n        sys.exit(1)", "language": "python" } ] } [/block] **Step 4:** Next we'll add our API key and secret (to get a Key and Secret, [sign up for a free trial](https://semantria.com/signup)).
[block:code] { "codes": [ { "code": "    session = semantria.Session(\"key00000-0000-0000-0000-000000000000\", \"secret0-0000-0000-0000-000000000000\", use_compression=True)", "language": "python" } ] } [/block] **Step 5:** Next we generate a unique collection ID and queue the collection to be processed. [block:code] { "codes": [ { "code": "    collection_id = str(uuid.uuid4())\n    status = session.queueCollection({\"id\": collection_id, \"documents\": docs})", "language": "python" } ] } [/block] **Step 6:** In the Semantria API, a 200 status means the collection has been queued and processed correctly and the server will likely respond with your processed data. A 202 status means the collection has been queued correctly (and the documents may still be processing). These are the only two HTTP statuses we want when queueing collections. A complete list of HTTP status messages can be found here. [block:code] { "codes": [ { "code": "    if status != 200 and status != 202:\n        print(\"Error:\")\n        print(status)\n        sys.exit(1)\n\n    print(\"%s collection queued successfully.\" % collection_id)", "language": "python" } ] } [/block] **Step 7:** Now we will retrieve our queued collection. [block:code] { "codes": [ { "code": "    result = None\n    while True:\n        time.sleep(1)\n        print(\"Retrieving your processed results...\")\n        result = session.getCollection(collection_id)\n        if result['status'] != 'QUEUED':\n            break\n\n    if result['status'] != 'PROCESSED':\n        print(\"Error:\")\n        print(result['status'])\n        sys.exit(1)", "language": "python" } ] } [/block] **Step 8:** The final step is printing our processed collection. [block:code] { "codes": [ { "code": "    print(\"\")\n    for facet in result['facets']:\n        print(\"%s : %s\" % (facet['label'], facet['count']))", "language": "python" } ] } [/block] Congratulations! You've completed your first Semantria Discovery analysis! Here is the expected output for `discovery_test_app.py`: [block:code] { "codes": [ { "code": "Semantria Collection processing mode demo.\nReading collection from file...\nd72b5994-a914-49be-b5a3-7c9e3cfc3e87 collection queued successfully.\nRetrieving your processed results...\nRetrieving your processed results...\n\nMayor Rob Ford : 2\nreferences : 2\nreferrals : 2\ncourse : 2\nmen : 2\nblunder : 2\nupdate : 1\nTuesday : 1\nPlace : 1\nremains : 1\nreply : 1\nrally : 1\nactor : 1\ncrash : 1\nOscar : 1", "language": "shell", "name": "Output" } ] } [/block]
[block:api-header] { "type": "basic", "title": "Detailed mode test app" } [/block] **Step 1:** Install the SDK. If you have pip, just type `pip install semantria_sdk` in the command line (if you don't have it, [download pip](https://pypi.python.org/pypi/pip)). [block:callout] { "type": "info", "body": "There are two kinds of analysis in the Semantria API: Detailed analysis and Discovery analysis. Detailed analysis processes individual documents (whether you send one or many documents) and returns the sentiment score, themes, and entities of each document. Discovery analysis processes documents as one collection and returns the facets and attributes that appear across many documents." } [/block] **Step 2:** Create a file called `detailed_test_app.py` and add the following lines: [block:code] { "codes": [ { "code": "from __future__ import print_function\nimport semantria\nimport uuid\nimport time\n\nserializer = semantria.JsonSerializer()", "language": "python", "name": "detailed_test_app.py" } ] } [/block] **Step 3:** Next, let's start the Semantria session. You'll need to input your API Key and Secret (to get a Key and Secret, sign up for a free trial here).You can create variables `semantria_key` and `semantria_secret` or input them as strings. In this example, we did not use variables. [block:code] { "codes": [ { "code": "session = semantria.Session(\"key00000-0000-0000-0000-000000000000\", \"secret0-0000-0000-0000-000000000000\", serializer, use_compression=True)", "language": "python" } ] } [/block] **Step 4:** Now we'll write some text to analyze. We used these three examples, but feel free to make up your own. [block:code] { "codes": [ { "code": "initialTexts = [\n \"Lisa - there's 2 Skinny cow coupons available $5 skinny cow ice cream coupons on special k boxes and Printable FPC from facebook - a teeny tiny cup of ice cream. I printed off 2 (1 from my account and 1 from dh's). I couldn't find them instore and i'm not going to walmart before the 19th. Oh well sounds like i'm not missing much ...lol\",\n \"In Lake Louise - a guided walk for the family with Great Divide Nature Tours rent a canoe on Lake Louise or Moraine Lake go for a hike to the Lake Agnes Tea House. In between Lake Louise and Banff - visit Marble Canyon or Johnson Canyon or both for family friendly short walks. In Banff a picnic at Johnson Lake rent a boat at Lake Minnewanka hike up Tunnel Mountain walk to the Bow Falls and the Fairmont Banff Springs Hotel visit the Banff Park Museum. The \\\"must-do\\\" in Banff is a visit to the Banff Gondola and some time spent on Banff Avenue - think candy shops and ice cream.\",\n \"On this day in 1786 - In New York City commercial ice cream was manufactured for the first time.\"\n]", "language": "python" } ] } [/block] **Step 5:** Next, we'll create unique document IDs for each text sample we just wrote. [block:code] { "codes": [ { "code": "for text in initialTexts:\n doc = {\"id\": str(uuid.uuid4()).replace(\"-\", \"\"), \"text\": text}", "language": "python" } ] } [/block] **Step 6**: Now we will queue our documents. Since this is an HTTP API, every request will return an HTTP status. Semantria has a whole list of custom HTTP statuses but for now, we are just checking if our documents have been queued properly (which will be a 202 status). 
[block:code] { "codes": [ { "code": " status = session.queueDocument(doc)\n if status == 202:\n print(\"\\\"\", doc[\"id\"], \"\\\" document queued successfully.\", \"\\r\\n\")", "language": "python" } ] } [/block] **Step 7:** Semantria requires one call for queueing and another for requesting the processed documents. In this toy example, we will loop through until all our documents have been processed. We check every two seconds if there are new documents and add them to our results until all documents are processed. [block:code] { "codes": [ { "code": "length = len(initialTexts)\nresults = []\n\nwhile len(results) < length:\n print(\"Retrieving your processed results...\", \"\\r\\n\")\n time.sleep(2)\n # get processed documents\n status = session.getProcessedDocuments()\n results.extend(status)", "language": "python" } ] } [/block] [block:callout] { "type": "danger", "body": "In a real application, you will want to have one job for queueing and another for requesting. `time.sleep()` will stop the entire program, so it will stop queueing and requesting; don't use it in your real application!" } [/block] **Step 8:** Now we will print our processed results. It will print the document sentiment score, themes, entities, and their accompanying sentiment scores for each document. [block:code] { "codes": [ { "code": "for data in results:\n # print document sentiment score\n print(\"Document \", data[\"id\"], \" Sentiment score: \", data[\"sentiment_score\"], \"\\r\\n\")\n\n # print document themes\n if \"themes\" in data:\n print(\"Document themes:\", \"\\r\\n\")\n for theme in data[\"themes\"]:\n print(\" \", theme[\"title\"], \" (sentiment: \", theme[\"sentiment_score\"], \")\", \"\\r\\n\")\n\n # print document entities\n if \"entities\" in data:\n print(\"Entities:\", \"\\r\\n\")\n for entity in data[\"entities\"]:\n print(\"\\t\", entity[\"title\"], \" : \", entity[\"entity_type\"],\" (sentiment: \", entity[\"sentiment_score\"], \")\", \"\\r\\n\")", "language": "python" } ] } [/block] That's it! 
Here is the expected output for `detailed_test_app.py`:

[block:code] { "codes": [ { "code": "\" 64f5abc2fe604890a6e730ad0b8e3ff6 \" document queued successfully.\n\" 3c5231606f874770b340393eb124b8cd \" document queued successfully.\n\" 94d304bf9eb44f28975f5113d394efea \" document queued successfully.\n\nRetrieving your processed results...\n\nDocument 64f5abc2fe604890a6e730ad0b8e3ff6 Sentiment score: -0.36575\nDocument themes:\n skinny cow ice cream coupons (sentiment: -0.3 )\n Skinny cow coupons (sentiment: -0.3 )\n tiny cup (sentiment: -0.36071876 )\n\nDocument 3c5231606f874770b340393eb124b8cd Sentiment score: 0.54\nDocument themes:\n guided walk (sentiment: 0.57000005 )\n short walks (sentiment: 0.52500004 )\n Banff Gondola (sentiment: 0.26250002 )\n candy shops (sentiment: 0.13125001 )\n ice cream (sentiment: 0.065625004 )\nEntities:\n\t Lake Louise : Place (sentiment: 1.2 )\n\t Banff Avenue : Place (sentiment: 0.4 )\n\t Moraine Lake : Place (sentiment: 0.6 )\n\t Agnes Tea House : Place (sentiment: 0.6 )\n\t Fairmont Banff Springs Hotel : Company (sentiment: 0.0 )\n\nDocument 94d304bf9eb44f28975f5113d394efea Sentiment score: 0.18291481\nDocument themes:\n commercial ice cream (sentiment: 0.18291481 )\nEntities:\n\t New York City : Place (sentiment: 0.152429 )", "language": "shell", "name": "Output" } ] } [/block]

As we can see, the first document had a negative sentiment of `-0.36575`, the second had a positive sentiment of `0.54`, and the third had a near-neutral sentiment of `0.1829`. Cool! You've just completed your first Semantria analysis. For a line-by-line explanation of all Detailed Mode analysis terms, check out our Detailed Mode Quick Reference.

Detailed Mode is very useful for one-by-one analysis, but sometimes we want overall trends across many documents. That's where Discovery Mode comes in.

[block:api-header] { "type": "basic", "title": "Discovery mode test app" } [/block]

[block:callout] { "type": "info", "body": "In a Discovery analysis, we can process multiple documents in one analysis and see comprehensive trends." } [/block]

**Step 1:** First, we'll need some source text. Here we have a collection of tweets we want to analyze in a file called `source.txt`. [Download the file](https://semantria.com/files/source.txt) and put it in the same directory as `detailed_test_app.py`.

**Step 2:** To make a Discovery analysis test app, create a file called `discovery_test_app.py` (in the same directory as `source.txt`) and add the following lines:

[block:code] { "codes": [ { "code": "from __future__ import print_function, unicode_literals\nimport sys\nimport time\nimport uuid\nimport semantria", "language": "python", "name": "discovery_test_app.py" } ] } [/block]

**Step 3:** Now we'll write some code that opens the source file and reads each line into our list `docs`.

[block:code] { "codes": [ { "code": "if __name__ == \"__main__\":\n    print(\"Semantria Collection processing mode demo.\")\n\n    docs = []\n    print(\"Reading collection from file...\")\n    for line in open('source.txt'):\n        docs.append(line)\n\n    if len(docs) < 1:\n        print(\"Source file isn't available or blank.\")\n        sys.exit(1)", "language": "python" } ] } [/block]

**Step 4:** Next we'll add our API key and secret (to get a key and secret, [sign up for a free trial](https://semantria.com/signup)).
[block:code] { "codes": [ { "code": " session = semantria.Session(\"key00000-0000-0000-0000-000000000000\", \"secret0-0000-0000-0000-000000000000\", use_compression=True)", "language": "python" } ] } [/block] **Step 5:** Next we generate a unique collection ID and queue the collection to be processed. [block:code] { "codes": [ { "code": " collection_id = str(uuid.uuid4())\n status = session.queueCollection({\"id\": collection_id, \"documents\": docs})", "language": "python" } ] } [/block] **Step 6:** In the Semantria API a 200 status means the collection has been queued and processed correctly and the server will likely respond with your processed data. A 202 status means the collection has been queued correctly (and the documents may still be processing). These are the only two HTTP statuses we want when queueing collections. A complete list of HTTP status messages can be found here. [block:code] { "codes": [ { "code": " if status != 200 and status != 202:\n print(\"Error:\")\n print(status)\n sys.exit(1)\n\n print(\"%s collection queued successfully.\" % collection_id)", "language": "python" } ] } [/block] **Step 7:** Now we will retrieve our queued collection. [block:code] { "codes": [ { "code": " result = None\n while True:\n time.sleep(1)\n print(\"Retrieving your processed results...\")\n result = session.getCollection(collection_id)\n if result['status'] != 'QUEUED':\n break\n if result['status'] != 'PROCESSED':\n print(\"Error:\")\n print(results['status'])\n sys.exit(1)", "language": "python" } ] } [/block] **Step 8:** The final step is printing our processed collection. [block:code] { "codes": [ { "code": " print(\"\")\n for facet in result['facets']:\n print(\"%s : %s\" % (facet['label'], facet['count']))", "language": "python" } ] } [/block] Congratulations! You've completed your first Semantria Discovery analysis! Here is the expected output for `discovery_test_app.py`: [block:code] { "codes": [ { "code": "Semantria Collection processing mode demo.\nReading collection from file...\nd72b5994-a914-49be-b5a3-7c9e3cfc3e87 collection queued successfully.\nRetrieving your processed results...\nRetrieving your processed results...\n\nMayor Rob Ford : 2\nreferences : 2\nreferrals : 2\ncourse : 2\nmen : 2\nblunder : 2\nupdate : 1\nTuesday : 1\nPlace : 1\nremains : 1\nreply : 1\nrally : 1\nactor : 1\ncrash : 1\nOscar : 1", "language": "shell", "name": "Output" } ] } [/block]
{"category":"577e4bf24159cd1900d5d2af","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2ec","createdAt":"2015-09-17T17:42:22.289Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":24,"body":"**Step 1:** Install the SDK. \n\n** Step 2:** Create a file of texts to analyze, one text per line, called source.txt\n\n** Step 3:** Enter the following code in your favorite editor. This just gets us set up to create the Semantria objects we need.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"using System;\\nusing System.Collections.Generic;\\nusing System.IO;\\n\\nusing Semantria.Com;\\nusing Semantria.Com.Serializers;\\nusing Semantria.Com.Mapping;\\nusing Semantria.Com.Mapping.Output;\\n\\nusing System.Linq;\\nusing Semantria.Com.Mapping.Configuration;\",\n      \"language\": \"csharp\"\n    }\n  ]\n}\n[/block]\n** Step 4:** Create the main namespace and set our credentials. Insert your key and secret in the blank strings.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"namespace DetailedModeTestApp\\n{\\n    class Program\\n    {\\n        static void Main(string[] args)\\n        {\\n            // Use correct Semantria API credentias here\\n            string consumerKey = \\\"\\\";\\n            string consumerSecret = \\\"\\\";\",\n      \"language\": \"csharp\"\n    }\n  ]\n}\n[/block]\n\n** Step 5:** Read in the texts to analyze to a List, and also create a dictionary of document IDs so we can check and make sure we got everything back\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"  // A dictionary that keeps IDs of sent documents and their statuses. 
It's required to make sure that we get correct documents from the API.\\n            Dictionary<string, TaskStatus> docsTracker = new Dictionary<string, TaskStatus>(4);\\n            List<string> initialTexts = new List<string>();\\n\\n            Console.WriteLine(\\\"Semantria Detailed processing mode demo.\\\");\\n\\n            string path = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, \\\"source.txt\\\");\\n            if (!File.Exists(path))\\n            {\\n                Console.WriteLine(\\\"Source file isn't available.\\\");\\n                return;\\n            }\\n\\n            //Reads collection from the source file\\n            Console.WriteLine(\\\"Reading dataset from file...\\\");\\n            using (StreamReader stream = new StreamReader(path))\\n            {\\n                while (!stream.EndOfStream)\\n                {\\n                    string line = stream.ReadLine();\\n                    if (string.IsNullOrEmpty(line) || line.Length < 3)\\n                        continue;\\n\\n                    initialTexts.Add(line);\\n                }\\n            }\",\n      \"language\": \"csharp\"\n    }\n  ]\n}\n[/block]\n          ** Step 6:** Start a Semantria session, using credentials set above, with an error handler.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"  // Creates JSON serializer instance\\n\\t\\t\\tISerializer serializer = new JsonSerializer();\\n\\n            // Initializes new session with the serializer object and the keys.\\n            using (Session session = Session.CreateSession(consumerKey, consumerSecret, serializer))\\n            {\\n                // Error callback handler. This event will occur in case of server-side error\\n                session.Error += new Session.ErrorHandler(delegate(object sender, ResponseErrorEventArgs ea)\\n                {\\n                    Console.WriteLine(string.Format(\\\"{0}: {1}\\\", (int)ea.Status, ea.Message));\\n                });\\n\\n                //Obtaining subscription object to get user limits applicable on server side\\n                Subscription subscription = session.GetSubscription();\",\n      \"language\": \"csharp\"\n    }\n  ]\n}\n[/block]\n    **Step 6:**Create a batch of content to analyze. We create unique IDs for each doc and add to our dictionary. If you already have unique IDs in your content system you can use those, or you can use none at all.      \n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \" List<Document> outgoingBatch = new List<Document>(subscription.BasicSettings.BatchLimit);\\n                IEnumerator<string> iterrator = initialTexts.GetEnumerator();\\n                while (iterrator.MoveNext())\\n                {\\n                    string docId = Guid.NewGuid().ToString();\\n                    Document doc = new Document()\\n                    {\\n                        Id = docId,\\n                        Text = iterrator.Current\\n                    };\\n\\n                    outgoingBatch.Add(doc);\\n                    docsTracker.Add(docId, TaskStatus.QUEUED);\",\n      \"language\": \"csharp\"\n    }\n  ]\n}\n[/block]\n          \n         **Step 7:** Check to see what our batch limit is in our subscription and make sure our batch of content isn't bigger than that. Once we're at the limit, or at the end of our texts, queue the batch to Semantria for processing. 
\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"                    if (outgoingBatch.Count == subscription.BasicSettings.BatchLimit)\\n                    {\\n                        // Queues batch of documents for processing on Semantria service\\n                        if (session.QueueBatchOfDocuments(outgoingBatch) != -1)\\n                        {\\n                            Console.WriteLine(string.Format(\\\"{0} documents queued successfully.\\\", outgoingBatch.Count));\\n                            outgoingBatch.Clear();\\n                        }\\n                    }\\n                }\\n\\n                if (outgoingBatch.Count > 0)\\n                {\\n                    // Queues batch of documents for processing on Semantria service\\n                    if (session.QueueBatchOfDocuments(outgoingBatch) != -1)\\n                    {\\n                        Console.WriteLine(string.Format(\\\"{0} documents queued successfully.\\\", outgoingBatch.Count));\\n                    }\\n                }\",\n      \"language\": \"csharp\"\n    }\n  ]\n}\n[/block]\n**Step 8:** Wait. Start checking for results from Semantria.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"                Console.WriteLine();\\n\\n                // As Semantria isn't a real-time solution you need to wait a bit  before getting the processed results\\n                // In a real application you would want two separate jobs, one for queuing source data another one for retreiving\\n                // Wait ten seconds while Semantria processes queued document\\n                List<DocAnalyticData> results = new List<DocAnalyticData>();\\n                while (docsTracker.Any(item => item.Value == TaskStatus.QUEUED))\\n                {\\n                    System.Threading.Thread.Sleep(500);\\n\\n                    // Requests processed results from Semantria service\\n                    Console.WriteLine(\\\"Retrieving your processed results...\\\");\\n                    IList<DocAnalyticData> incomingBatch = session.GetProcessedDocuments();\\n\\n                    foreach (DocAnalyticData item in incomingBatch)\\n                    {\\n                        if (docsTracker.ContainsKey(item.Id))\\n                        {\\n                            docsTracker[item.Id] = item.Status;\\n                            results.Add(item);\\n                        }\\n                    }\\n                }\\n                Console.WriteLine();\",\n      \"language\": \"csharp\"\n    }\n  ]\n}\n[/block]\n**Step 9:** Print out the results for review.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"                foreach (DocAnalyticData data in results)\\n                {\\n                    // Printing of document sentiment score\\n                    Console.WriteLine(string.Format(\\\"Document {0}. 
Sentiment score: {1}\\\", data.Id, data.SentimentScore));\\n\\n                    // Printing of intentions\\n                    if (data.AutoCategories != null && data.AutoCategories.Count > 0)\\n                    {\\n                        Console.WriteLine(\\\"Document categories:\\\");\\n                        foreach (DocCategory category in data.AutoCategories)\\n                            Console.WriteLine(string.Format(\\\"\\\\t{0} (strength: {1})\\\", category.Title, category.StrengthScore));\\n                    }\\n\\n                    // Printing of document themes\\n                    if (data.Themes != null && data.Themes.Count > 0)\\n                    {\\n                        Console.WriteLine(\\\"Document themes:\\\");\\n                        foreach (DocTheme theme in data.Themes)\\n                            Console.WriteLine(string.Format(\\\"\\\\t{0} (sentiment: {1})\\\", theme.Title, theme.SentimentScore));\\n                    }\\n\\n                    // Printing of document entities\\n                    if (data.Entities != null && data.Entities.Count > 0)\\n                    {\\n                        Console.WriteLine(\\\"Entities:\\\");\\n                        foreach (DocEntity entity in data.Entities)\\n                            Console.WriteLine(string.Format(\\\"\\\\t{0} : {1} (sentiment: {2})\\\", entity.Title, entity.EntityType, entity.SentimentScore));\\n                    }\\n\\n                    Console.WriteLine();\\n                }\\n            }\\n\\n            Console.ReadKey(false);\\n        }\\n    }\\n}\",\n      \"language\": \"csharp\"\n    }\n  ]\n}\n[/block]","excerpt":"This is a slightly more worked out example of the code you saw earlier that demonstrates using IDs with documents as well as checking for status codes on submission. This example using the polling method to get results back from Semantria.\n\nThis quickstart will guide you through your first Semantria analysis. We will create a Detailed mode test app. We'll use .NET to create this example, but many other languages could be used. We have SDKs for C, Java, PHP, Python, Ruby, and node.js.","slug":"quick-start-with-net","type":"basic","title":"Quick Start with .NET","__v":0,"childrenPages":[]}

Quick Start with .NET

This is a slightly more worked-out example of the code you saw earlier; it demonstrates using IDs with documents as well as checking for status codes on submission. This example uses the polling method to get results back from Semantria. This quickstart will guide you through your first Semantria analysis. We will create a Detailed mode test app. We'll use .NET to create this example, but many other languages could be used. We have SDKs for C, Java, PHP, Python, Ruby, and node.js.

**Step 1:** Install the SDK.

**Step 2:** Create a file of texts to analyze, one text per line, called `source.txt`.

**Step 3:** Enter the following code in your favorite editor. This just gets us set up to create the Semantria objects we need.

[block:code] { "codes": [ { "code": "using System;\nusing System.Collections.Generic;\nusing System.IO;\n\nusing Semantria.Com;\nusing Semantria.Com.Serializers;\nusing Semantria.Com.Mapping;\nusing Semantria.Com.Mapping.Output;\n\nusing System.Linq;\nusing Semantria.Com.Mapping.Configuration;", "language": "csharp" } ] } [/block]

**Step 4:** Create the main namespace and set our credentials. Insert your key and secret in the blank strings.

[block:code] { "codes": [ { "code": "namespace DetailedModeTestApp\n{\n class Program\n {\n static void Main(string[] args)\n {\n // Use correct Semantria API credentials here\n string consumerKey = \"\";\n string consumerSecret = \"\";", "language": "csharp" } ] } [/block]

**Step 5:** Read the texts to analyze into a List, and create a dictionary of document IDs so we can check that we got everything back.

[block:code] { "codes": [ { "code": " // A dictionary that keeps IDs of sent documents and their statuses. It's required to make sure that we get correct documents from the API.\n Dictionary<string, TaskStatus> docsTracker = new Dictionary<string, TaskStatus>(4);\n List<string> initialTexts = new List<string>();\n\n Console.WriteLine(\"Semantria Detailed processing mode demo.\");\n\n string path = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, \"source.txt\");\n if (!File.Exists(path))\n {\n Console.WriteLine(\"Source file isn't available.\");\n return;\n }\n\n //Reads collection from the source file\n Console.WriteLine(\"Reading dataset from file...\");\n using (StreamReader stream = new StreamReader(path))\n {\n while (!stream.EndOfStream)\n {\n string line = stream.ReadLine();\n if (string.IsNullOrEmpty(line) || line.Length < 3)\n continue;\n\n initialTexts.Add(line);\n }\n }", "language": "csharp" } ] } [/block]

**Step 6:** Start a Semantria session, using the credentials set above, with an error handler.

[block:code] { "codes": [ { "code": " // Creates JSON serializer instance\n\t\t\tISerializer serializer = new JsonSerializer();\n\n // Initializes new session with the serializer object and the keys.\n using (Session session = Session.CreateSession(consumerKey, consumerSecret, serializer))\n {\n // Error callback handler. This event will occur in case of server-side error\n session.Error += new Session.ErrorHandler(delegate(object sender, ResponseErrorEventArgs ea)\n {\n Console.WriteLine(string.Format(\"{0}: {1}\", (int)ea.Status, ea.Message));\n });\n\n //Obtaining subscription object to get user limits applicable on server side\n Subscription subscription = session.GetSubscription();", "language": "csharp" } ] } [/block]

**Step 7:** Create a batch of content to analyze. We create a unique ID for each doc and add it to our dictionary. If you already have unique IDs in your content system you can use those, or you can use none at all.
[block:code] { "codes": [ { "code": " List<Document> outgoingBatch = new List<Document>(subscription.BasicSettings.BatchLimit);\n IEnumerator<string> iterrator = initialTexts.GetEnumerator();\n while (iterrator.MoveNext())\n {\n string docId = Guid.NewGuid().ToString();\n Document doc = new Document()\n {\n Id = docId,\n Text = iterrator.Current\n };\n\n outgoingBatch.Add(doc);\n docsTracker.Add(docId, TaskStatus.QUEUED);", "language": "csharp" } ] } [/block] **Step 7:** Check to see what our batch limit is in our subscription and make sure our batch of content isn't bigger than that. Once we're at the limit, or at the end of our texts, queue the batch to Semantria for processing. [block:code] { "codes": [ { "code": " if (outgoingBatch.Count == subscription.BasicSettings.BatchLimit)\n {\n // Queues batch of documents for processing on Semantria service\n if (session.QueueBatchOfDocuments(outgoingBatch) != -1)\n {\n Console.WriteLine(string.Format(\"{0} documents queued successfully.\", outgoingBatch.Count));\n outgoingBatch.Clear();\n }\n }\n }\n\n if (outgoingBatch.Count > 0)\n {\n // Queues batch of documents for processing on Semantria service\n if (session.QueueBatchOfDocuments(outgoingBatch) != -1)\n {\n Console.WriteLine(string.Format(\"{0} documents queued successfully.\", outgoingBatch.Count));\n }\n }", "language": "csharp" } ] } [/block] **Step 8:** Wait. Start checking for results from Semantria. [block:code] { "codes": [ { "code": " Console.WriteLine();\n\n // As Semantria isn't a real-time solution you need to wait a bit before getting the processed results\n // In a real application you would want two separate jobs, one for queuing source data another one for retreiving\n // Wait ten seconds while Semantria processes queued document\n List<DocAnalyticData> results = new List<DocAnalyticData>();\n while (docsTracker.Any(item => item.Value == TaskStatus.QUEUED))\n {\n System.Threading.Thread.Sleep(500);\n\n // Requests processed results from Semantria service\n Console.WriteLine(\"Retrieving your processed results...\");\n IList<DocAnalyticData> incomingBatch = session.GetProcessedDocuments();\n\n foreach (DocAnalyticData item in incomingBatch)\n {\n if (docsTracker.ContainsKey(item.Id))\n {\n docsTracker[item.Id] = item.Status;\n results.Add(item);\n }\n }\n }\n Console.WriteLine();", "language": "csharp" } ] } [/block] **Step 9:** Print out the results for review. [block:code] { "codes": [ { "code": " foreach (DocAnalyticData data in results)\n {\n // Printing of document sentiment score\n Console.WriteLine(string.Format(\"Document {0}. 
Sentiment score: {1}\", data.Id, data.SentimentScore));\n\n // Printing of intentions\n if (data.AutoCategories != null && data.AutoCategories.Count > 0)\n {\n Console.WriteLine(\"Document categories:\");\n foreach (DocCategory category in data.AutoCategories)\n Console.WriteLine(string.Format(\"\\t{0} (strength: {1})\", category.Title, category.StrengthScore));\n }\n\n // Printing of document themes\n if (data.Themes != null && data.Themes.Count > 0)\n {\n Console.WriteLine(\"Document themes:\");\n foreach (DocTheme theme in data.Themes)\n Console.WriteLine(string.Format(\"\\t{0} (sentiment: {1})\", theme.Title, theme.SentimentScore));\n }\n\n // Printing of document entities\n if (data.Entities != null && data.Entities.Count > 0)\n {\n Console.WriteLine(\"Entities:\");\n foreach (DocEntity entity in data.Entities)\n Console.WriteLine(string.Format(\"\\t{0} : {1} (sentiment: {2})\", entity.Title, entity.EntityType, entity.SentimentScore));\n }\n\n Console.WriteLine();\n }\n }\n\n Console.ReadKey(false);\n }\n }\n}", "language": "csharp" } ] } [/block]
{"category":"577e4bf24159cd1900d5d2af","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2ed","createdAt":"2015-07-16T20:27:21.705Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":25,"body":"Semantria can process content in many different languages. Each Semantria configuration has a language associated with it, and will only give meaningful results if the content submitted matches the language of the configuration.\n\nSemantria can detect the content language but does not route content based on the detected language. This is because you can create many configurations for the same language to support your needs and Semantria does not know which of those are best suited to the content you submit.\n\nYou can control whether you want to see the detected language in the returned data at a configuration level. This can be useful if you have content where you do not know the language of the individual pieces, but will cost you API credits as any other content submission does.","excerpt":"","slug":"language-identification","type":"basic","title":"Language identification and content routing","__v":0,"childrenPages":[]}

Language identification and content routing


Semantria can process content in many different languages. Each Semantria configuration has a language associated with it, and will only give meaningful results if the content submitted matches the language of the configuration.

Semantria can detect the content language but does not route content based on the detected language. This is because you can create many configurations for the same language to support your needs, and Semantria does not know which of those are best suited to the content you submit.

You can control whether you want to see the detected language in the returned data at a configuration level. This can be useful if you have content where you do not know the language of the individual pieces, but it will cost you API credits as any other content submission does.
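To illustrate one way of routing on your side, the sketch below detects each document's language with a client-side detector and picks a matching configuration before queueing. It assumes the Python SDK used in the quickstarts; `langdetect` is an arbitrary stand-in for whatever detector you prefer, and the language-to-configuration mapping assumes you have created one configuration per language (treat the exact configuration field names as assumptions and check the configurations endpoint documentation).

[block:code] { "codes": [ { "code": "from langdetect import detect  # stand-in detector; any language detector will do\nimport semantria\n\nsession = semantria.Session(\"your-key\", \"your-secret\")\n\n# Build a language -> config_id map; configurations carry a language field.\nconfigs = {c[\"language\"].lower(): c[\"config_id\"] for c in session.getConfigurations()}\n\ndoc = {\"id\": \"doc1\", \"text\": \"Ceci est un exemple de texte.\"}\nlang = {\"en\": \"english\", \"fr\": \"french\"}.get(detect(doc[\"text\"]))\n\n# Falls back to the default configuration when no match is found\nsession.queueDocument(doc, configs.get(lang))", "language": "python", "name": "Sketch: client-side language routing" } ] } [/block]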
{"category":"577e4bf24159cd1900d5d2af","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2ee","createdAt":"2015-09-16T19:31:54.825Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":26,"body":"There's a lot more you can do with Semantria. Common tasks include modifying the default sentiment, creating queries to classify your documents and integrating Semantria into a production-grade content processing system.\n\n  * In How To Configure Semantria, you will see how to configure the NLP extraction - custom sentiment, queries, entities, blacklists and so on. \n  * API Integration Scenarios describes common ways to integrate Semantria into your content processing infrastructure. \n  * For each of our endpoints we have detailed documentation on how to use them to configure Semantria, and integrate with the API.\n  * Finally, we also have sample API output with explanations of each of the terms.","excerpt":"","slug":"next-steps","type":"basic","title":"Next Steps","__v":0,"childrenPages":[]}

Next Steps


There's a lot more you can do with Semantria. Common tasks include modifying the default sentiment, creating queries to classify your documents, and integrating Semantria into a production-grade content processing system.

  * In How To Configure Semantria, you will see how to configure the NLP extraction - custom sentiment, queries, entities, blacklists, and so on.
  * API Integration Scenarios describes common ways to integrate Semantria into your content processing infrastructure.
  * For each of our endpoints we have detailed documentation on how to use them to configure Semantria and integrate with the API.
  * Finally, we also have sample API output with explanations of each of the terms.
{"category":"577e4bf24159cd1900d5d2b0","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2be","createdAt":"2015-07-07T21:24:33.336Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":27,"body":"[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Data processing essentials\"\n}\n[/block]\nSemantria is an asynchronous API. This means:\n  * You submit content to us and retrieve content separately.\n  * You can scale your content submission rates as you are not waiting on us to hand data back before you can submit more.\n  * You may receive data back in a different order than you submitted it. Batches of content are not necessarily preserved\n  * If you use the callback retrieval mechanism, the batches remain the same, but the order might be different\n  * If you use auto response or polling, the batch membership may also change\n  * If you have multiple machines sending and receiving content, one machine may receive a processed document that was submitted by another.\n  * Every piece of content is processed by a Semantria configuration. If you don't specify one, your default configuration will be used.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Handling Failures\"\n}\n[/block]\nThere are several types of failures during the submission and processing of content.\n1. The submission is itself invalid in some way such as invalid JSON. In this case no documents are queued, and no API credits are deducted. You need to correct the errors and resubmit. You will know the submission is invalid if you receive anything other than a 200-series HTTP status response.\n2. The submission is valid but the content itself is failed. In this case you will receive the document back, with a FAILED status and an error message stating why it was failed. Credits are deducted for this. In this case, you should not resubmit the piece of content that was failed, as it will simply fail again. The most common cases of document failure are submitting content to the wrong language (sending Arabic content to an English config for instance) and content that does not have enough text to analyze (such as ASCII art and the like)\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Keeping Track Of Your Content\"\n}\n[/block]\nBecause order and batch are not preserved on the Semantria side, it is up to you to keep track of what you submitted and received. There are several ways for you to identify your content.\n1. Each document can have a unique **id** associated with it. This is returned to you by Semantria when you receive the processed data. You can use this **id** to update the status on your side. Additionally, you can request the status of a document via its **id**.\n2. Each document can also have a tag field. This is a string field you can fill in with additional information you might use to keep track documents, such as a project ID. You can check on your side to see that you submitted 1,000 documents for tag \"my_project\" and received 1,000 documents back with that tag. You cannot request status on a tag.\n3. You can submit and retrieve by job_id. 
This is a string value you can set when you submit and retrieve documents. it is intended to allow you to separate out processing streams of content for routing or failover purposes on your side, not as a unique ID per batch of content.\n  * If you submit by a job_id, you must retrieve via that same job_id. Retrieving by config_id will not retrieve documents submitted with a job_id. This does not prevent you from setting the config_id when submitting with a job_id, you just cannot retrieve by that config_id.\n  * The total number of unique job_ids you use during a 24hr period must not exceed 100.\nNote\n[block:callout]\n{\n  \"type\": \"warning\",\n  \"body\": \"It is possible to send two documents with the same document id, as long as you send them to different configurations. If you send two documents with the same document id to the same configuration, the latter document sent will overwrite the former. Data loss may occur.\",\n  \"title\": \"Duplicate Document ID\"\n}\n[/block]\n\n[block:callout]\n{\n  \"type\": \"warning\",\n  \"body\": \"If a user tries to process a document or analysis that has already been sent and processed but has not yet been retrieved, the server will override the previous analysis, change its status to QUEUED, and process the newer document.\"\n}\n[/block]\n\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Data Processing\"\n}\n[/block]\nThere are four types of processing a document or group of documents in a Semantria analysis.\n**Queue**: submit a document or batch of documents for Detailed analysis or submit a collection for Discovery analysis.\n  * Queue with a POST method and the server will return with an HTTP status.\n  * For example, queuing documents for analysis is like lining up bottles to be filled by the milkman.\n**Request**: determine the status of a certain document with its document ID.\n  * Request with a GET request and the server will respond with either the processed results of the current status: QUEUED, PROCESSED, or FAILED.\n  * For example, requesting a document is like asking the status of a specific bottle-- the milkman will either give you the filled bottle or tell you why you can't have it yet.\n  * In a request call, if the server responds with a \"PROCESSED\" status, it will also return the corresponding processed data.\n  * In a request call, if the server responds with a \"QUEUED\" or \"FAILED\" status, a corresponding reason or error will accompany it.\n**Retrieve**: return all processed documents.\n  * Retrieve with a GET request and the server will return the results of all documents that have been processed. It will return nothing if no documents are processed.\n  * For example, retrieving documents is like asking the milkman for any and all full bottles.\n**Cancel**: delete a queued document if Semantria has not processed it yet.\n  * This is a DELETE request.\n  * For example, cancelling a document is like removing the empty bottle before it has been filled by the milkman.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Detailed Mode Processing Methods\"\n}\n[/block]\n## Queuing Documents\n[block:callout]\n{\n  \"type\": \"info\",\n  \"body\": \"Queuing: submitting a document or batch of documents for Detailed analysis.\"\n}\n[/block]\nUsers must queue documents into the API for processing. A document can be processed with a specific configuration (by using a particular config_id) or with the default configuration (by passing nothing in the config_id field. 
Single documents under 2KB in size should come back in a few seconds.\n\n**One Document:**\nFor individual documents, the URL is https://api.semantria.com/document.[json|xml].\nThe **config_id** parameter should be submitted as part of the url, like in the example below. The request body should contain an XML or JSON object with three fields: the document ID, the text to be analyzed, and an optional tag in the POST request.\nAfter submitting documents to be queued, each document will be analyzed independently of the others. Semantria API will return an analysis for each document.\n[block:callout]\n{\n  \"type\": \"info\",\n  \"body\": \"The server will process each document independently of any other processes or documents. Documents will not influence each other in processing.\"\n}\n[/block]","excerpt":"","slug":"send-documents","type":"basic","title":"Processing Basics","__v":0,"childrenPages":[]}

Processing Basics


[block:api-header] { "type": "basic", "title": "Data processing essentials" } [/block]

Semantria is an asynchronous API. This means:

  * You submit content to us and retrieve the results separately.
  * You can scale your content submission rates, as you are not waiting on us to hand data back before you can submit more.
  * You may receive data back in a different order than you submitted it. Batches of content are not necessarily preserved:
    * If you use the callback retrieval mechanism, the batches remain the same, but the order might be different.
    * If you use auto response or polling, the batch membership may also change.
  * If you have multiple machines sending and receiving content, one machine may receive a processed document that was submitted by another.
  * Every piece of content is processed by a Semantria configuration. If you don't specify one, your default configuration will be used.

[block:api-header] { "type": "basic", "title": "Handling Failures" } [/block]

There are two main types of failure during the submission and processing of content:

1. The submission itself is invalid in some way, such as malformed JSON. In this case no documents are queued and no API credits are deducted. You need to correct the errors and resubmit. You will know the submission is invalid if you receive anything other than a 200-series HTTP status response.
2. The submission is valid but the content itself is failed. In this case you will receive the document back with a FAILED status and an error message stating why it failed. Credits are deducted for this. You should not resubmit a failed piece of content, as it will simply fail again. The most common causes of document failure are submitting content in the wrong language (sending Arabic content to an English config, for instance) and content that does not have enough text to analyze (such as ASCII art and the like).

[block:api-header] { "type": "basic", "title": "Keeping Track Of Your Content" } [/block]

Because order and batch are not preserved on the Semantria side, it is up to you to keep track of what you submitted and received. There are several ways to identify your content (see the sketch after this list):

1. Each document can have a unique **id** associated with it. This is returned to you by Semantria when you receive the processed data. You can use this **id** to update the status on your side. Additionally, you can request the status of a document via its **id**.
2. Each document can also have a tag field. This is a string field you can fill in with additional information you might use to keep track of documents, such as a project ID. You can check on your side that you submitted 1,000 documents for tag "my_project" and received 1,000 documents back with that tag. You cannot request status on a tag.
3. You can submit and retrieve by job_id. This is a string value you can set when you submit and retrieve documents. It is intended to let you separate out processing streams of content for routing or failover purposes on your side, not to serve as a unique ID per batch of content.
    * If you submit by a job_id, you must retrieve via that same job_id. Retrieving by config_id will not retrieve documents submitted with a job_id. This does not prevent you from setting the config_id when submitting with a job_id; you just cannot retrieve by that config_id.
    * The total number of unique job_ids you use during a 24-hour period must not exceed 100.
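For example, here is a minimal sketch of tracking submissions by **id** and tag, reusing the Python SDK `session` from the quickstart; the `texts` list and the "my_project" tag are placeholders.

[block:code] { "codes": [ { "code": "import uuid\n\nsubmitted = {}  # our own record of what we sent\nfor text in texts:\n    doc_id = str(uuid.uuid4()).replace(\"-\", \"\")\n    submitted[doc_id] = \"QUEUED\"\n    session.queueDocument({\"id\": doc_id, \"text\": text, \"tag\": \"my_project\"})\n\n# Later: match returned documents against the ids we generated\nfor data in session.getProcessedDocuments():\n    if data[\"id\"] in submitted:\n        submitted[data[\"id\"]] = data[\"status\"]  # e.g. PROCESSED or FAILED", "language": "python", "name": "Sketch: tracking by id and tag" } ] } [/block]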
[block:callout] { "type": "warning", "body": "It is possible to send two documents with the same document id, as long as you send them to different configurations. If you send two documents with the same document id to the same configuration, the latter document sent will overwrite the former. Data loss may occur.", "title": "Duplicate Document ID" } [/block]

[block:callout] { "type": "warning", "body": "If a user tries to process a document or analysis that has already been sent and processed but has not yet been retrieved, the server will override the previous analysis, change its status to QUEUED, and process the newer document." } [/block]

[block:api-header] { "type": "basic", "title": "Data Processing" } [/block]

There are four operations for processing a document or group of documents in a Semantria analysis.

**Queue**: submit a document or batch of documents for Detailed analysis, or submit a collection for Discovery analysis.
  * Queue with a POST method and the server will return an HTTP status.
  * For example, queuing documents for analysis is like lining up bottles to be filled by the milkman.

**Request**: determine the status of a certain document via its document ID.
  * Request with a GET request and the server will respond with either the processed results or the current status: QUEUED, PROCESSED, or FAILED.
  * For example, requesting a document is like asking the status of a specific bottle -- the milkman will either give you the filled bottle or tell you why you can't have it yet.
  * In a request call, if the server responds with a "PROCESSED" status, it will also return the corresponding processed data.
  * In a request call, if the server responds with a "QUEUED" or "FAILED" status, a corresponding reason or error will accompany it.

**Retrieve**: return all processed documents.
  * Retrieve with a GET request and the server will return the results of all documents that have been processed. It will return nothing if no documents are processed.
  * For example, retrieving documents is like asking the milkman for any and all full bottles.

**Cancel**: delete a queued document if Semantria has not processed it yet.
  * This is a DELETE request.
  * For example, cancelling a document is like removing the empty bottle before it has been filled by the milkman.

[block:api-header] { "type": "basic", "title": "Detailed Mode Processing Methods" } [/block]

## Queuing Documents

[block:callout] { "type": "info", "body": "Queuing: submitting a document or batch of documents for Detailed analysis." } [/block]

Users must queue documents into the API for processing. A document can be processed with a specific configuration (by using a particular config_id) or with the default configuration (by passing nothing in the config_id field). Single documents under 2KB in size should come back in a few seconds.

**One Document:**
For individual documents, the URL is https://api.semantria.com/document.[json|xml]. The **config_id** parameter should be submitted as part of the URL. The body of the POST request should contain an XML or JSON object with three fields: the document ID, the text to be analyzed, and an optional tag (see the sketch below). After submitting documents to be queued, each document will be analyzed independently of the others. The Semantria API will return an analysis for each document.

[block:callout] { "type": "info", "body": "The server will process each document independently of any other processes or documents. Documents will not influence each other in processing." } [/block]
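To make the request shape concrete, here is a sketch of queueing one document with the Python SDK from the quickstart; the key, secret, and config_id values are placeholders. The SDK submits the request to the https://api.semantria.com/document.json endpoint described above, with the config_id carried in the URL.

[block:code] { "codes": [ { "code": "import semantria\n\nsession = semantria.Session(\"your-key\", \"your-secret\", semantria.JsonSerializer())\n\ndoc = {\n    \"id\": \"doc-001\",  # document ID\n    \"text\": \"The service was quick and friendly.\",  # text to be analyzed\n    \"tag\": \"my_project\"  # optional tag\n}\n\n# Pass a config_id to use a specific configuration; omit it to use the default.\nstatus = session.queueDocument(doc, \"your-config-id\")\nif status == 202:\n    print(\"Document queued successfully.\")", "language": "python", "name": "Sketch: queueing one document" } ] } [/block]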
{"category":"577e4bf24159cd1900d5d2b0","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2bf","createdAt":"2015-07-23T22:47:15.051Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":28,"body":"Using a callback URL (also called web hook) is our recommended choice for retrieving data from Semantria API. Semantria delivers processed data to your callback URL automatically. The callback URL is set per config, so you can route different configs to different handlers. The URL must end in .json.\n[block:image]\n{\n  \"images\": [\n    {\n      \"image\": [\n        \"https://files.readme.io/pL7e9ueUTVOIIyv0TJ9C_web-hook.png\",\n        \"web-hook.png\",\n        \"491\",\n        \"356\",\n        \"#cf2424\",\n        \"\"\n      ]\n    }\n  ]\n}\n[/block]\n**Pros**\n  * Almost real time analysis\n  * Most convenient and transparent data retrieval\n  * Delivers processed data automatically as soon as it is ready\n  * Consistent batch size and sequence of documents\n  * Accessible outside a client's company\n\n**Cons**\n  * Requires an open port in the firewall in order to provide a callback URL for data delivery\n  * Callback URL must be open (Semantria doesn't support authentication for callback URLs)\n\n**Use Cases**\n  * Users with a URL available for data delivery\n  * Those who want fast, automatic data retrieval while preserving batch structure\n\n**Troubleshooting**\n\nIn the callback URL mechanism, Semantria sends a POST callback request and waits for a 200 response. If the remote application responds otherwise, Semantria considers it a failure and will try again. After three failures, Semantria will drop the processed data.","excerpt":"","slug":"callback","type":"basic","title":"Callback","__v":0,"childrenPages":[]}

Callback


Using a callback URL (also called a webhook) is our recommended choice for retrieving data from the Semantria API. Semantria delivers processed data to your callback URL automatically. The callback URL is set per configuration, so you can route different configs to different handlers. The URL must end in .json.

[block:image] { "images": [ { "image": [ "https://files.readme.io/pL7e9ueUTVOIIyv0TJ9C_web-hook.png", "web-hook.png", "491", "356", "#cf2424", "" ] } ] } [/block]

**Pros**
  * Almost real-time analysis
  * Most convenient and transparent data retrieval
  * Delivers processed data automatically as soon as it is ready
  * Consistent batch size and sequence of documents
  * Accessible outside a client's company

**Cons**
  * Requires an open port in the firewall in order to provide a callback URL for data delivery
  * Callback URL must be open (Semantria doesn't support authentication for callback URLs)

**Use Cases**
  * Users with a URL available for data delivery
  * Those who want fast, automatic data retrieval while preserving batch structure

**Troubleshooting**

In the callback URL mechanism, Semantria sends a POST callback request and waits for a 200 response. If the remote application responds otherwise, Semantria considers it a failure and will try again. After three failures, Semantria will drop the processed data.
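A minimal receiver sketch, assuming Flask (any web framework that answers 200 quickly will do); Semantria POSTs processed batches to the configured URL, retries on non-200 responses, and drops the data after three failures:

```python
# A callback (webhook) receiver sketch; the route path is a placeholder.
from flask import Flask, request

app = Flask(__name__)

@app.route("/semantria/callback.json", methods=["POST"])  # URL must end in .json
def semantria_callback():
    documents = request.get_json(force=True)   # the processed batch
    for doc in documents:
        print(doc.get("id"), doc.get("status"))  # persist instead of printing
    return "", 200  # anything other than 200 counts as a delivery failure

if __name__ == "__main__":
    app.run(port=8080)
```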
{"category":"577e4bf24159cd1900d5d2b0","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2c0","createdAt":"2015-07-23T22:46:05.962Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":29,"body":"Polling is the simplest method of retrieving data from Semantria API. Users directly request processed data whenever they need it. Client-side applications may push data as single documents or in batches of 100 documents per API call. Semantria for Excel uses the pulling mechanism.\n[block:image]\n{\n  \"images\": [\n    {\n      \"image\": [\n        \"https://files.readme.io/cOtEI82SS9G1V9xF7VpM_pulling.png\",\n        \"pulling.png\",\n        \"487\",\n        \"354\",\n        \"#e42222\",\n        \"\"\n      ]\n    }\n  ]\n}\n[/block]\n**Pros**\n  * Can be used in a distributed environment where one application pushes the data and another retrieves the processed results\n  * Completely user-managed; no control on Semantria's side\n\n**Cons**\n  * Requires one job for pushing data and another for retrieving data to run simultaneously\n  * Incoming and outgoing batch size might be inconsistent; pulling will return batches of 100 documents per API call, regardless of incoming batch size\n  * Sequence of documents may change\n\n**Use Cases**\n  * Casual users\n  * Those who cannot provide a callback URL\n  * Not for those interested in real-time data processing","excerpt":"","slug":"pulling","type":"basic","title":"Polling","__v":0,"childrenPages":[]}

Polling


Polling is the simplest method of retrieving data from the Semantria API. Users directly request processed data whenever they need it. Client-side applications may push data as single documents or in batches of 100 documents per API call. Semantria for Excel uses the polling mechanism.

[block:image] { "images": [ { "image": [ "https://files.readme.io/cOtEI82SS9G1V9xF7VpM_pulling.png", "pulling.png", "487", "354", "#e42222", "" ] } ] } [/block]

**Pros**
  * Can be used in a distributed environment where one application pushes the data and another retrieves the processed results
  * Completely user-managed; no control on Semantria's side

**Cons**
  * Requires one job for pushing data and another for retrieving data to run simultaneously
  * Incoming and outgoing batch size might be inconsistent; polling will return batches of 100 documents per API call, regardless of incoming batch size
  * Sequence of documents may change

**Use Cases**
  * Casual users
  * Those who cannot provide a callback URL
  * Not for those interested in real-time data processing
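A polling sketch, assuming the SDK calls shown on the endpoint pages below; credentials, config_id, and texts are placeholders:

```python
# One job pushes a batch; another polls until everything submitted has come back.
import time
import semantria

session = semantria.Session("your-key", "your-secret")
config_id = "your-config-id"
texts = ["first document", "second document"]  # placeholder content

batch = [{"id": str(i), "text": t} for i, t in enumerate(texts)]
session.queueBatch(batch, config_id=config_id)

remaining = {doc["id"] for doc in batch}
while remaining:
    # Each call returns up to one batch (100 documents by default) of results;
    # retrieved documents are removed from Semantria, so persist them here.
    for doc in session.getProcessedDocuments(config_id=config_id):
        remaining.discard(doc["id"])
    time.sleep(5)  # poll at a modest interval
```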
{"category":"577e4bf24159cd1900d5d2b0","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2c1","createdAt":"2015-07-23T22:46:55.929Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":30,"body":"Auto-response retrieves the processed results for the n-th document during the n+1-st request.This allows for almost real time data retrieval. For example, say you are processing constantly updating data such as a Twitter stream. On Second 1, you submit documents 1 to 10. Since this is your first submission of the day, no documents are returned to you. On Second 2, you submit documents 11 to 20. This time you will receive documents 1-10 back in the submission response. On Second 3, you submit documents 21-30, and you get back documents 11-20, and so on.\n\nIf you are processing a bounded amount of content, such as a set of 1,000 survey responses or the like, auto-response is not a good choice, as you need to remember to make another call at the end to get the last submitted batch. Auto-response is really designed for a constantly updating stream of data with no beginning and no end.\n\nAuto-response is disabled by default and is enabled per-configuration. If there is no processed data, Semantria will respond with a 202 HTTP status.\n\nSemantria will respond with an array of up to n processed documents following the status message. The default number of documents is 2 for almost real-time speed. For high volume, we suggest 100 or more documents. Please contact us to make a change.\n[block:image]\n{\n  \"images\": [\n    {\n      \"image\": [\n        \"https://files.readme.io/fJew24oReKp52ZNePELz_auto-response.png\",\n        \"auto-response.png\",\n        \"497\",\n        \"314\",\n        \"#c23535\",\n        \"\"\n      ]\n    }\n  ]\n}\n[/block]\n**Pros**\n  * Fastest retrieval option: almost real time\n  * Automatic: does not require dedicated data retrieval job to get processed data back\n\n**Cons**\n  * Batch size and sequence not preserved in retrieval\n  * Does not allow for retrieving large amounts of processed data at once\n\n**Use Cases**\n  * Users interested in 24/7 data processing and continuous usage\n  * Users who want to retrieve data as fast as possible\n  * Those who want to monitor social media trends\n  * Business intelligence and trending services","excerpt":"","slug":"auto-response","type":"basic","title":"Auto-response","__v":0,"childrenPages":[]}

Auto-response


Auto-response retrieves the processed results for the n-th document during the (n+1)-th request. This allows for almost real-time data retrieval. For example, say you are processing constantly updating data such as a Twitter stream. On second 1, you submit documents 1 to 10. Since this is your first submission of the day, no documents are returned to you. On second 2, you submit documents 11 to 20. This time you will receive documents 1-10 back in the submission response. On second 3, you submit documents 21-30 and get back documents 11-20, and so on.

If you are processing a bounded amount of content, such as a set of 1,000 survey responses or the like, auto-response is not a good choice, as you need to remember to make another call at the end to get the last submitted batch. Auto-response is really designed for a constantly updating stream of data with no beginning and no end.

Auto-response is disabled by default and is enabled per configuration. If there is no processed data, Semantria will respond with a 202 HTTP status.

Semantria will respond with an array of up to n processed documents following the status message. The default number of documents is 2, for almost real-time speed. For high volume, we suggest 100 or more documents. Please contact us to make a change.

[block:image] { "images": [ { "image": [ "https://files.readme.io/fJew24oReKp52ZNePELz_auto-response.png", "auto-response.png", "497", "314", "#c23535", "" ] } ] } [/block]

**Pros**
  * Fastest retrieval option: almost real time
  * Automatic: does not require a dedicated data retrieval job to get processed data back

**Cons**
  * Batch size and sequence not preserved in retrieval
  * Does not allow for retrieving large amounts of processed data at once

**Use Cases**
  * Users interested in 24/7 data processing and continuous usage
  * Users who want to retrieve data as fast as possible
  * Those who want to monitor social media trends
  * Business intelligence and trending services
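An auto-response sketch of the pattern described above: each submission's HTTP response carries documents processed from earlier submissions. Whether your SDK version surfaces that payload as the return value of queueBatch is an assumption here; the raw POST response is what actually carries it.

```python
# Auto-response pattern sketch; config must have auto-response enabled.
import semantria

session = semantria.Session("your-key", "your-secret")
config_id = "autoresponse-config-id"  # a config with auto-response enabled

stream_of_batches = [[{"id": "1", "text": "tweet one"}],
                     [{"id": "2", "text": "tweet two"}]]  # stands in for an endless stream

for batch in stream_of_batches:
    # ASSUMPTION: queueBatch returns the auto-response payload; check your SDK.
    returned = session.queueBatch(batch, config_id=config_id)
    if returned:  # results for documents submitted on earlier calls
        for doc in returned:
            print(doc.get("id"), doc.get("status"))
```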
{"category":"577e4bf24159cd1900d5d2b0","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2c2","createdAt":"2015-07-16T20:37:21.017Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":31,"body":"For best performance, we recommend you batch your content. There are several factors to consider when creating a batch:\n\n* All documents in a batch must be processed by the same configuration\n* All documents in a single batch must be processed before the documents are returned.\n* Larger batches lead to higher overall throughput but longer latency on individual batches\n* The maximum batch size is limited by your subscription.","excerpt":"","slug":"batching","type":"basic","title":"Batching","__v":0,"childrenPages":[]}

Batching


For best performance, we recommend you batch your content. There are several factors to consider when creating a batch (a chunking sketch follows this list):

* All documents in a batch must be processed by the same configuration.
* All documents in a single batch must be processed before the documents are returned.
* Larger batches lead to higher overall throughput but longer latency on individual batches.
* The maximum batch size is limited by your subscription.
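A batching sketch: group documents by configuration, then submit chunks no larger than your subscription's batch limit (100 here is an assumption; check your own limit):

```python
# Group per configuration, then submit in subscription-sized chunks.
import semantria

MAX_BATCH = 100  # ASSUMPTION: check your subscription for the actual limit

def submit_in_batches(session, docs_by_config):
    """docs_by_config maps a config_id to the list of documents meant for it."""
    for config_id, docs in docs_by_config.items():  # one configuration per batch
        for start in range(0, len(docs), MAX_BATCH):
            session.queueBatch(docs[start:start + MAX_BATCH], config_id=config_id)
```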
{"category":"577e4bf24159cd1900d5d2b0","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2c3","createdAt":"2015-07-23T22:50:58.793Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":32,"body":"* Make sure to include a URL in your configuration settings if you would like to use a callback URL.\n* If you want to use a callback URL or manually pull your results, be sure that auto_response is turned off.\n* If you are using auto-response for a set amount of data, you must make one more call after submitting the last batch of data.","excerpt":"","slug":"troubleshooting-tips","type":"basic","title":"Troubleshooting Tips","__v":0,"childrenPages":[]}

Troubleshooting Tips


* Make sure to include a URL in your configuration settings if you would like to use a callback URL.
* If you want to use a callback URL or manually pull your results, be sure that auto_response is turned off.
* If you are using auto-response for a set amount of data, you must make one more call after submitting the last batch of data.
{"__v":0,"_id":"577e4bf24159cd1900d5d2f2","api":{"examples":{"codes":[{"language":"http","code":" POST https://api.semantria.com/document.json?config_id=cd2e7341-a3c2-4fb4-9d3a\n  -779e8b0a5eff\n{\n  \"id\" : \"cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n  \"text\" : \"A chunk of text for processing\",\n  \"tag\" : \"A tag is any text (up to 50 characters) used like a marker.\"\n}","name":""},{"language":"xml","code":"POST https://api.semantria.com/document.xml?config_id=cd2e7341-a3c2-4fb4-9d3a-\n  779e8b0a5eff\n<document>\n   <id>cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</id>\n   <text>A chunk of text for processing</text>\n   <tag>A tag is any text (up to 50 characters) used like a marker.</tag>\n</document>"},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.queueDocument( \n  configId = \"id\",\n  {\n  \t\"id\" : \"cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n  \t\"text\" : \"A chunk of text for processing\",\n  \t\"tag\" : \"A tag is any text (up to 50 characters) used like a marker.\"\n  }"}]},"results":{"codes":[{"status":202,"language":"text","code":"HTTP/1.0 202 Request accepted and queued for processing.","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55aeb50f826d210d00041d0c","ref":"","required":true,"desc":"(Optional) the configuration id","default":"","type":"string","name":"config_id","in":"body"}],"url":"/document.[json|xml]"},"body":"","category":"577e4bf24159cd1900d5d2b1","createdAt":"2015-07-21T21:09:35.815Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":33,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"data-processing","sync_unique":"","title":"Sending one document at a time","type":"post","updates":[],"user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

POST Sending one document at a time


Body JSON

config_id (string, optional): the configuration ID; if omitted, your default configuration is used

Definition

POST https://api.semantria.com/document.[json|xml]
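A queuing example adapted from this page's Python sample; the argument order is adjusted to be valid Python, so check it against your SDK version. The key, secret, and IDs are placeholders:

```python
# Queue one document for Detailed analysis.
import semantria

session = semantria.Session("your-key", "your-secret")
session.queueDocument(
    {
        "id": "cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff",
        "text": "A chunk of text for processing",
        "tag": "A tag is any text (up to 50 characters) used like a marker.",
    },
    config_id="your-config-id",
)
```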



{"__v":0,"_id":"577e4bf24159cd1900d5d2f3","api":{"examples":{"codes":[{"language":"json","code":"POST https://api.semantria.com/document/batch.json?config_id=cd2e7341-a3c2-4fb4-\n  9d3a-779e8b0a5eff\n[\n{\n  \"id\" : \" 6F9619FF8B86D011B42D00CF4FC964FF\",\n  \"text\" : \"A chunk of text for processing\"\n  \"tag\" : \"Any text (up to 50 characters) used like a marker.\"\n},\n{\n  \"id\" : \" 9G9619RG9286NG889E2D00CF4FBI9R7F\",\n  \"text\" : \"The next chunk of text for processing\"\n  \"tag\" : \"Any text (up to 50 characters) used like a marker.\"\n}\n]","name":""},{"language":"xml","code":"POST https://api.semantria.com/document/batch.xml?config_id=cd2e7341-a3c2-4fb4-\n  9d3a-779e8b0a5eff\n<documents>\n  <document>\n    <id>6F9619FF8B86D011B42D00CF4FC964FF</id>\n    <text>A chunk of text for processing</text>\n    <tag>Any text (up to 50 characters) used like a marker.</tag>\n  </document>\n  <document>\n    <id>9G9619RG9286NG889E2D00CF4FBI9R7F</id>\n    <text>The next chunk of text for processing </text>\n    <tag>Any text (up to 50 characters) used like a marker.</tag>\n  </document>\n</documents>"},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.queueBatch(   \n  [\n    {\n  \t\t\"id\" : \" 6F9619FF8B86D011B42D00CF4FC964FF\",\n  \t\t\"text\" : \"A chunk of text for processing\"\n  \t\t\"tag\" : \"Any text (up to 50 characters) used like a marker.\"\n\t\t},\n\t\t{\n  \t\t\"id\" : \" 9G9619RG9286NG889E2D00CF4FBI9R7F\",\n  \t\t\"text\" : \"The next chunk of text for processing\"\n  \t\t\"tag\" : \"Any text (up to 50 characters) used like a marker.\"\n\t\t}\n\t],\n  config_id = \"id\"\n)"}]},"results":{"codes":[{"name":"","code":"HTTP/1.0 202 Request accepted and queued for processing. ","language":"json","status":200},{"name":"","code":"{ 'status': 400, 'message': 'Internal Server Error: Interceptor exception. Request body is syntactically incorrect.'\n}","language":"json","status":400},{"name":"Authentication failed","code":"{'status': 401, 'message': 'Authentication failed.'}","language":"json","status":401},{"name":"Out of credits","code":"{'status': 402, 'message': 'Request is unauthorized. Documents balance exceeded. See Verify Subscription end-point for details.'}","language":"json","status":402},{"name":"Data calls limit exceeded","status":402,"language":"json","code":"{'status': 402, 'message': 'Data API calls limit has been exceeded. Please contact support@semantria.com to review your limits and troubleshoot.'}"},{"name":"Expired License","code":"{'status': 402, 'message': 'Request is unauthorized. User license is expired.'}","language":"json","status":402},{"code":"{'status': 406, 'message': 'Limit of documents per batch has been exceeded.'}","language":"json","status":406,"name":"Batch too large"},{"status":403,"language":"json","code":"ERROR: {'status': 403, 'message': 'You are trying to use configuration with the language that is forbidden for your license. 
Please contact Semantria support (support@semantria.com) for details.'}"},{"name":"Character limit exceeded","code":"ERROR: { 'status': 413, 'message' : 'Limit of characters per single document has been exceeded.')","language":"json","status":413}]},"settings":"","auth":"required","params":[{"_id":"55aeb6f8826d210d00041d14","ref":"","required":true,"desc":"The configuration ID","default":"","type":"string","name":"config_id","in":"body"}],"url":"/batch.[json|xml]"},"body":"[block:callout]\n{\n  \"type\": \"info\",\n  \"title\": \"Detailed analysis\",\n  \"body\": \"Users can submit a batch of documents for Detailed analysis, but each document will still be surveyed and analyzed independently (as if they were submitted one by one).\"\n}\n[/block]\n\n[block:callout]\n{\n  \"type\": \"info\",\n  \"body\": \"If submitting many documents for Detailed Mode, your analysis will be much more efficient if the whole batch is queued for the same configuration. Sorting documents into configuration-specific batches will keep your documents organized and your analysis fast.\",\n  \"title\": \"Queue by configuration for efficiency\"\n}\n[/block]","category":"577e4bf24159cd1900d5d2b1","createdAt":"2015-07-21T21:11:44.864Z","editedParams":true,"editedParams2":true,"excerpt":"A batch is an array of documents, each which consists of three values: an optional unique ID, an option tag, and the text you wish to analyze.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":34,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"data-processing-multiple-documents","sync_unique":"","title":"Sending multiple documents at a time","type":"post","updates":[],"user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

POST Sending multiple documents at a time

A batch is an array of documents, each of which consists of three values: an optional unique ID, an optional tag, and the text you wish to analyze.

Body JSON

config_id (string, required): the configuration ID
[block:callout] { "type": "info", "title": "Detailed analysis", "body": "Users can submit a batch of documents for Detailed analysis, but each document will still be surveyed and analyzed independently (as if they were submitted one by one)." } [/block] [block:callout] { "type": "info", "body": "If submitting many documents for Detailed Mode, your analysis will be much more efficient if the whole batch is queued for the same configuration. Sorting documents into configuration-specific batches will keep your documents organized and your analysis fast.", "title": "Queue by configuration for efficiency" } [/block]

Definition

POST https://api.semantria.com/document/batch.[json|xml]
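A batch queuing example adapted from this page's Python sample, with the missing commas restored; the key, secret, and IDs are placeholders:

```python
# Queue a batch of documents; each is still analyzed independently.
import semantria

session = semantria.Session("your-key", "your-secret")
session.queueBatch(
    [
        {
            "id": "6F9619FF8B86D011B42D00CF4FC964FF",
            "text": "A chunk of text for processing",
            "tag": "Any text (up to 50 characters) used like a marker.",
        },
        {
            "id": "9G9619RG9286NG889E2D00CF4FBI9R7F",
            "text": "The next chunk of text for processing",
            "tag": "Any text (up to 50 characters) used like a marker.",
        },
    ],
    config_id="your-config-id",
)
```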



{"__v":0,"_id":"577e4bf24159cd1900d5d2f4","api":{"examples":{"codes":[{"name":"","code":"POST https://api.semantria.com/document/batch.json?config_id=cd2e7341-a3c2-4fb4-\n  9d3a-779e8b0a5eff\n[\n{\n  \"id\" : \" 6F9619FF8B86D011B42D00CF4FC964FF\",\n  \"text\" : \"A chunk of text for processing\",\n  \"tag\" : \"Any text (up to 50 characters) used like a marker.\"\n},\n{\n  \"id\" : \" 9G9619RG9286NG889E2D00CF4FBI9R7F\",\n  \"text\" : \"The next chunk of text for processing\",\n  \"tag\" : \"Any text (up to 50 characters) used like a marker.\"\n}\n]","language":"json"},{"code":"POST https://api.semantria.com/document/batch.xml?config_id=cd2e7341-a3c2-4fb4-\n  9d3a-779e8b0a5eff\n<documents>\n  <document>\n    <id>6F9619FF8B86D011B42D00CF4FC964FF</id>\n    <text>A chunk of text for processing</text>\n    <tag>Any text (up to 50 characters) used like a marker.</tag>\n  </document>\n  <document>\n    <id>9G9619RG9286NG889E2D00CF4FBI9R7F</id>\n    <text>The next chunk of text for processing </text>\n    <tag>Any text (up to 50 characters) used like a marker.</tag>\n  </document>\n</documents>","language":"xml"},{"code":"import semantria\nsession = semantria.Session(key, secret)\nsession.queueBatch(   \n  [\n    {\n  \t\t\"id\" : \" 6F9619FF8B86D011B42D00CF4FC964FF\",\n  \t\t\"text\" : \"A chunk of text for processing\"\n  \t\t\"tag\" : \"Any text (up to 50 characters) used like a marker.\"\n\t\t},\n\t\t{\n  \t\t\"id\" : \" 9G9619RG9286NG889E2D00CF4FBI9R7F\",\n  \t\t\"text\" : \"The next chunk of text for processing\"\n  \t\t\"tag\" : \"Any text (up to 50 characters) used like a marker.\"\n\t\t}\n\t],\n  config_id = \"id\"\n)","language":"python"}]},"results":{"codes":[{"status":200,"language":"json","code":"HTTP/1.0 202 Request accepted and queued for processing. ","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55f9c7071d5a631900cf4034","ref":"","required":false,"desc":"(Optional) config ID to process content with","default":"","type":"string","name":"config_id","in":"body"},{"_id":"55f9c7071d5a631900cf4033","ref":"","required":false,"desc":"(Optional) job_id","default":"","type":"string","name":"job_id","in":"body"}],"url":"/batch.[json|xml]"},"body":"[block:callout]\n{\n  \"type\": \"info\",\n  \"title\": \"Detailed analysis\",\n  \"body\": \"Users can submit a batch of documents for Detailed analysis, but each document will still be surveyed and analyzed independently (as if they were submitted one by one).\"\n}\n[/block]\n\n[block:callout]\n{\n  \"type\": \"info\",\n  \"body\": \"If submitting many documents for Detailed Mode, your analysis will be much more efficient if the whole batch is queued for the same configuration. Sorting documents into configuration-specific batches will keep your documents organized and your analysis fast.\",\n  \"title\": \"Queue by configuration for efficiency\"\n}\n[/block]","category":"577e4bf24159cd1900d5d2b1","createdAt":"2015-09-16T19:46:15.581Z","editedParams":true,"editedParams2":true,"excerpt":"A job_id is not required but can be helpful to separate out content processed via particular environments (dev vs QA) or particular job streams (historical content vs live news).\n\nJob_id is only supported by the polling method (not callback or auto response) and if you submit via job_id, you must also retrieve by job_id. 
You can use only 100 unique job_ids per 24 hour period.\n\nIf you specified a job_id when submitting you must also specify a job_id when retrieving.\n\nYou can use both a config_id and a job_id in a document submission. \n\nA batch of documents is an array of documents for processing, which each contain the following parameters: (optional) id, text, (optional) tag","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":35,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"sending-documents-by-job_id","sync_unique":"","title":"Sending documents by job_id","type":"post","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

POST Sending documents by job_id

A job_id is not required, but it can be helpful for separating out content processed via particular environments (dev vs. QA) or particular job streams (historical content vs. live news).

job_id is only supported by the polling method (not callback or auto-response). If you submitted with a job_id, you must also specify that job_id when retrieving. You can use only 100 unique job_ids per 24-hour period.

You can use both a config_id and a job_id in a document submission.

A batch of documents is an array of documents for processing, each of which contains the following parameters: (optional) id, text, (optional) tag.

Body JSON

config_id (string, optional): config ID to process content with
job_id (string, optional): the job_id
[block:callout] { "type": "info", "title": "Detailed analysis", "body": "Users can submit a batch of documents for Detailed analysis, but each document will still be surveyed and analyzed independently (as if they were submitted one by one)." } [/block] [block:callout] { "type": "info", "body": "If submitting many documents for Detailed Mode, your analysis will be much more efficient if the whole batch is queued for the same configuration. Sorting documents into configuration-specific batches will keep your documents organized and your analysis fast.", "title": "Queue by configuration for efficiency" } [/block]

Definition

POST https://api.semantria.com/document/batch.[json|xml]
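A job_id workflow sketch. This page shows no SDK submission example, so the job_id keyword on queueBatch is an assumption mirroring the documented body parameter; retrieval uses the SDK call documented further below.

```python
# Submit and retrieve a job stream by job_id.
import semantria

session = semantria.Session("your-key", "your-secret")

docs = [{"id": "1", "text": "historical article"},
        {"id": "2", "text": "another historical article"}]
# ASSUMPTION: job_id keyword mirrors the documented body parameter.
session.queueBatch(docs, config_id="your-config-id", job_id="historical")

# Documents submitted with a job_id must be retrieved with that same job_id:
for doc in session.getProcessedDocumentsByJobId(job_id="historical"):
    print(doc.get("id"), doc.get("status"))
```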



{"__v":0,"_id":"577e4bf24159cd1900d5d2f5","api":{"examples":{"codes":[{"name":"","code":"GET https://api.semantria.com/document/d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff.json","language":"text"},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.getDocument(\n  doc_id = \"id\",\n  config_id = \"id\"\n )"}]},"results":{"codes":[{"status":200,"language":"json","code":"HTTP/1.0 200 Request accepted and served.\n{\n   \"id\" : \"d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n   \"config_id\" : \"cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n   \"tag\" : \"Any text (up to 50 characters) used like a marker.\",\n   \"status\" : \"PROCESSED\",\n   \"source_text\" : \"See \\\"Output Data Details\\\" chapter.\",\n   \"language\" : \"English\",\n   \"metadata\": {\n        \"author\": \"Tim Mohler, \n        \"date\": \"20160325\"\n    }, \n   \"language_score\" : 0.6972651,\n   \"sentiment_score\" : 0.8295653,\n   \"sentiment_polarity\" : \"positive\",\n   \"summary\" : \"Summary of the document’s text.\",\n   \"details\" : [\n      {\n         \"is_imperative\" : false,\n         \"is_polar : false,\n         \"words\" : [\n            {\n               \"tag\" : \"NNP\",\n               \"type\" : \"Noun\",\n               \"title\" : \"Aaron\",\n               \"stemmed\" : \"Aaron\",\n               \"is_negated\" : false,\n               \"sentiment_score\" : 0.569\n            }\n         ]\n      }\n   ],\n   \"phrases\" : [\n      {\n         \"title\" : \"friendly\",\n         \"sentiment_score\" : -0.4,\n         \"sentiment_polarity\" : \"negative\",\n         \"is_negated\" : true,\n         \"negating_phrase\" : \"not\"\n\t\t \"is_intensified\": false,\n\t\t \"type : \"detected\",\n      }\n   ],\n   \"model_sentiment\": \n\t  {\n\t\t\"sentiment_polarity\": \"neutral\",\n\t\t\"model_name\": \"default\",\n\t\t\"mixed_score\": 0.10868213325738907,\n\t\t\"negative_score\": 0.18008844554424286,\n\t\t\"neutral_score\": 0.47879499197006226,\n\t\t\"positive_score\": 0.2324344366788864\n\t  },  \n   \"auto_categories\" :[\n      {\n         \"title\" : \"Automotive\",\n         \"type\" : \"node\",\n         \"strength_score\" : 0.378,\n         \"categories\" : [\n            {\n               \"title\" : \"Moto\",\n               \"type\" : \"leaf\",\n               \"strength_score\" : 0.67\n            }\n         ]\n      }\n   ],\n   \"themes\" : [\n      {\n         \"evidence\" : 1,\n         \"is_about\" : true,\n         \"strength_score\" : 0.0,\n         \"sentiment_score\" : 0.0,\n         \"sentiment_polarity\" : \"neutral\",\n         \"title\" : \"republican moderates\"\n         \"mentions\" : [\n            {\n               \"label\" : \"Something\",\n               \"is_negated\" : true,\n               \"negating_phrase\" : \"negator\",\n               \"locations\" : [\n                  {\n                     \"offset\" : 987,\n                     \"length\" : 9\n                  }\n               ]\n            }\n         ]\n      }\n   ],\n   \"entities\" : [\n      {\n         \"type\" : \"named\",\n         \"evidence\" : 0,\n         \"confident\" : false,\n         \"is_about\" : true,\n         \"entity_type\" : \"Place\",\n         \"title\" : \"WASHINGTON\",\n         \"label\" : \"The capital of the United States of America.\",\n         \"sentiment_score\" : 1.0542796,\n         \"sentiment_polarity\" : \"positive\",\n         \"mentions\" : [\n            {\n               \"label\" : \"Something\",\n               
\"is_negated\" : true,\n               \"negating_phrase\" : \"negator\",\n               \"locations\" : [\n                  {\n                     \"offset\" : 987,\n                     \"length\" : 9\n                  }\n               ]\n            }\n         ]\n         \"themes\" : [\n            {\n               \"evidence\" : 1,\n               \"is_about\" : true,\n               \"strength_score\" : 0.0,\n               \"sentiment_score\" : 0.0,\n               \"sentiment_polarity\" : \"neutral\",\n               \"title\" : \"republican moderates\"\n               \"mentions\" : [\n                  {\n                     \"label\" : \"Something\",\n                     \"is_negated\" : true,\n                     \"negating_phrase\" : \"negator\",\n                     \"locations\" : [\n                        {\n                           \"offset\" : 987,\n                           \"length\" : 9\n                        }\n                     ]\n                  }\n               ]\n            }\n         ]\n      }\n   ],\n   \"relations\" : [\n      {\n         \"type\" : \"named\",\n         \"relation_type\" : \"Occupation\",\n         \"confidence_score\" : 1.0,\n         \"extra\" : \"\",\n         \"entities\" : [\n            {\n               \"title\" : \"head judge\",\n               \"entity_type\" : \"Job Title\"\n            },\n            {\n               \"title\" : \"John Snow\",\n               \"entity_type\" : \"Person\"\n            }\n         ]\n      }\n   ],\n   \"opinions\" : [\n      {\n         \"quotation\" : \"Some opinion of John Kerry about the US.\",\n         \"type\" : \"named\",\n         \"speaker\" : \"John Kerry\",\n         \"topic\" : \"United States\",\n         \"sentiment_score\" : 0.49,\n         \"sentiment_polarity\" : \"positive\"\n      }\n   ]\n   \"topics\" : [\n      {\n         \"title\" : \"Something\",\n         \"type\" : \"concept\",\n         \"hitcount\" : 0,\n         \"strength_score\" : 0.0,\n         \"sentiment_score\" : 0.6133076,\n         \"sentiment_polarity\" : \"positive\"\n      }\n   ]\n}","name":""},{"status":400,"language":"json","code":"{}","name":""},{"status":200,"language":"xml","code":"HTTP/1.0 200 Request accepted and served.\n<document>\n  <config_id>cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</config_id>\n  <id>d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</id>\n  <tag>Any text (up to 50 characters) used like a marker.</tag>\n  <status>PROCESSED</status>\n  <source_text>See ”Output Data Details” chapter</source_text>\n  <language>English</language>\n  <language_score>0.6972651</language_score>\n  <sentiment_score>0.2398756</sentiment_score>\n  <sentiment_polarity>positive</sentiment_polarity>\n  <summary>Summary of the document’s text.</summary>\n  <details>\n    <sentence>\n      <is_imperative>false</is_imperative>\n      <is_polar>false</is_polar>\n        <words>\n          <word>\n            <tag>NNP</tag>\n            <type>Noun</type>\n            <title>Aaron</title>\n            <stemmed>Aaron</stemmed>\n            <is_negated>false</is_negated>\n            <sentiment_score>0.569</sentiment_score>\n          </word>\n        </words>\n      </sentence>\n    </details>\n    <phrases>\n      <phrase>\n         <title>friendly</title>\n\t\t <sentiment_score>-0.4</sentiment_score>\n         <sentiment_polarity>negative</sentiment_polarity>\t\n         <is_negated>true</is_negated>\t\n\t\t <negating_phrase>not</negating_phrase>\n\t\t <is_intensified>false</is_intensified>\n         
<type>detected</type>\n      </phrase>\n    </phrases >\n\t<model_sentiment>\n\t\t <sentiment_polarity>neutral</sentiment_polarity>\n\t\t <model_name>default</model_name>\n\t\t <mixed_score>0.10868213325738907</mixed_score>\n\t\t <negative_score>0.18008844554424286</negative_score>\n\t\t <neutral_score>0.47879499197006226</neutral_score>\n\t\t <positive_score>0.2324344366788864</positive_score>\n\t</model_sentiment>\n    <auto_categories>\n      <category>\n        <title>Automotive</title>\n        <type>node</type>\n        <strength_score>0.378</strength_score>\n        <categories>\n          <category>\n            <title>Moto</title>\n            <type>leaf</type>\n            <strength_score>0.67</strength_score>\n          </category>\n        </categories>\n      </category>\n    </auto_categories>\n    <themes>\n      <theme>\n      <evidence>1</evidence>\n      <is_about>true</is_about>\n      <strength_score>0.0</strength_score>\n      <sentiment_score>0.0</sentiment_score>\n      <sentiment_polarity>neutral</sentiment_polarity>\n      &lt;title&gt;republican moderates&lt;/title&gt;\n      <mentions>\n        <mention>\n        <label>Something</label>\n        <is_negated>true</is_negated>\n        <negating_phrase>negator</negating_phrase>\n        <locations>\n          <location>\n            <offset>987</offset>\n            <length>9</length>\n          </location>\n        </locations>\n      </mention>\n    </mentions>\n  </theme>\n</themes>\n\n<entities>\n  <entity>\n    <type>named</type>\n    <evidence>0</evidence>\n    <confident>false</confident>\n    <is_about>true</is_about>\n    <entity_type>Place</entity_type>\n    <title>WASHINGTON</title>\n    <label>The capital of the United States of America.</label>\n    <sentiment_score>1.0542796</sentiment_score>\n    <sentiment_polarity>positive</sentiment_polarity>\n    <mentions> <mention>\n    <label>Something</label>\n    <is_negated>true</is_negated>\n    <negating_phrase>negator</negating_phrase>\n    <locations> <location>\n    <offset>987</offset>\n    <length>9</length>\n    </location> </locations>\n    </mention> </mentions>\n    <themes>\n      <theme>\n        <evidence>1</evidence>\n        <is_about>true</is_about>\n        <strength_score>0.0</strength_score>\n        <sentiment_score>0.0</sentiment_score>\n        <sentiment_polarity>neutral</sentiment_polarity>\n        <title>republican moderates</title>\n          <mentions>\n            <mention>\n              <label>Something</label>\n              <is_negated>true</is_negated>\n              <negating_phrase>negator</negating_phrase>\n              <locations>\n                <location>\n                  <offset>987</offset>\n                  <length>9</length>\n                </location>\n              </locations>\n            </mention>\n          </mentions>\n        </theme>\n      </themes>\n    </entity>\n  </entities>\n\n  <relations>\n    <relation>\n      <type>named</type>\n      <relation_type>Occupation</relation_type>\n      <confidence_score>1.0</confidence_score>\n      <extra>took</extra>\n      <entities>\n        <entity>\n          <title>head judge</title>\n          <entity_type>Job Title</entity_type>\n        </entity>\n\n        <entity>\n          <title>John Snow</title>\n          <entity_type>Person</entity_type>\n        </entity>\n      </entities>\n    </relation>\n  </relations>\n  <opinions>\n    <opinion>\n      <quotation>Some opinion of John Kerry about the US.</quotation>\n      <type>named</type>\n      
<speaker>John Kerry</speaker>\n      <topic>United States</topic>\n      <sentiment_score>0.49</sentiment_score>\n      <sentiment_polarity>positive</sentiment_polarity>\n    </opinion>\n  </opinions>\n    <topics>\n      <topic>\n        <title>Something</title>\n        <hitcount>0</hitcount>\n          <sentiment_score>0.6133076</sentiment_score>\n          <sentiment_polarity>positive</sentiment_polarity>\n          <strength_score>0.6133076</strength_score>\n          <type>concept</type>\n      </topic>\n    </topics>\n</document>"}]},"settings":"","auth":"required","params":[{"_id":"55aeba1dc97a1a0d0022455f","ref":"","required":false,"desc":"Optional if the document was processed with the default configuration. Required for non-default configurations.","default":"","type":"string","name":"config_id","in":"query"},{"_id":"55afebdfd7624e3700e4fafb","ref":"","required":true,"desc":"the id of the desired document","default":"","type":"string","name":"document_id","in":"path"}],"url":"/document/:document_id.[json|xml]"},"body":"","category":"577e4bf24159cd1900d5d2b1","createdAt":"2015-07-21T21:31:09.710Z","editedParams":true,"editedParams2":true,"excerpt":"Requesting: Asking the status (or processed results) of a specific document","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":36,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"requesting-documents","sync_unique":"","title":"Requesting specific documents","type":"get","updates":[],"user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

GET Requesting specific documents

Requesting: Asking the status (or processed results) of a specific document

Path Params

document_id (string, required): the ID of the desired document

Query Params

config_id (string): optional if the document was processed with the default configuration; required for non-default configurations

Definition

GET https://api.semantria.com/document/:document_id.[json|xml]
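A request example based on this page's Python sample; the key, secret, and IDs are placeholders:

```python
# Ask after a specific document by its id.
import semantria

session = semantria.Session("your-key", "your-secret")
doc = session.getDocument(
    doc_id="d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff",
    config_id="your-config-id",  # omit for the default configuration
)
# A PROCESSED response includes the full analysis; QUEUED or FAILED responses
# carry a corresponding reason or error instead.
print(doc.get("status"))
```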



{"__v":0,"_id":"577e4bf24159cd1900d5d2f6","api":{"examples":{"codes":[{"language":"http","code":"GET https://api.semantria.com/document/processed.json?config_id=cd2e7341-a3c2-4fb4-\n  9d3a-779e8b0a5eff","name":""},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.getProcessedDocuments(\n  config_id = \"id\"\n)"}]},"results":{"codes":[{"status":202,"language":"json","code":"HTTP/1.0 202 Request accepted and served.\n[\n   {\n      \"id\" : \"d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n      \"config_id\" : \"cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n      \"tag\" : \"Any text (up to 50 characters) used like a marker.\",\n      \"status\" : \"PROCESSED\"\n      //Accompanying output as described in the \"Request Document\" section\n   },\n   {\n      \"id\" : \"d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n      \"config_id\" : \"cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n      \"tag\" : \"Any text (up to 50 characters) used like a marker.\",\n      \"status\" : \"PROCESSED\"\n      //Accompanying output as described in the \"Request Document\" section\n   }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""},{"status":202,"language":"xml","code":"HTTP/1.0 202 Request accepted and served.\n<documents>\n  <document>\n    <id>d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</id>\n    <config_id>cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</config_id>\n    <tag>Any text (up to 50 characters) used like a marker.</tag>\n    <status>PROCESSED</status>\n    <!-- Accompanying output as described in the “Request Document” section -->\n  </document>\n  <document>\n    <id>d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</id>\n    <config_id>cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</config_id>\n    <tag>Any text (up to 50 characters) used like a marker.</tag>\n    <status>PROCESSED</status>\n    <!-- Accompanying output as described in the “Request Document” section -->\n  </document>\n</documents>"}]},"settings":"","auth":"required","params":[{"_id":"55aebc31555b900d0036d161","ref":"","required":false,"desc":"(Optional) the configuration id","default":"","type":"string","name":"config_id","in":"query"},{"_id":"56d9faf92716531d00b1a5d5","ref":"","required":false,"desc":"(Optional) the job_id","default":"","type":"string","name":"job_id","in":"query"}],"url":"/document/processed.[json | xml]"},"body":"[block:callout]\n{\n  \"type\": \"info\",\n  \"body\": \"By default, the server responds to retrieval requests with 100 documents per batch. To increase this limit, please contact us.\",\n  \"title\": \"Batch Size Limit\"\n}\n[/block]\n\n[block:callout]\n{\n  \"type\": \"warning\",\n  \"body\": \"Once a document has been retrieved, it will be removed from the Semantria systems.\"\n}\n[/block]","category":"577e4bf24159cd1900d5d2b1","createdAt":"2015-07-21T21:40:01.062Z","editedParams":true,"editedParams2":true,"excerpt":"This call retrieves as many processed documents as fit into your maximum batch size. Note the HTTP code will not work as is because it requires a timestamp and nonce. Using our SDK will take care of this for you.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":37,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"retrieving-documents","sync_unique":"","title":"Retrieving Documents","type":"get","updates":[],"user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

GET Retrieving Documents

This call retrieves as many processed documents as fit into your maximum batch size. Note that the raw HTTP example will not work as-is, because requests must carry a timestamp and nonce; using our SDK takes care of this for you.

Query Params

config_id (string, optional): the configuration ID
job_id (string, optional): the job_id
[block:callout] { "type": "info", "body": "By default, the server responds to retrieval requests with 100 documents per batch. To increase this limit, please contact us.", "title": "Batch Size Limit" } [/block] [block:callout] { "type": "warning", "body": "Once a document has been retrieved, it will be removed from the Semantria systems." } [/block]

Definition

GET https://api.semantria.com/document/processed.[json|xml]
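A retrieval example based on this page's Python sample; remember that retrieved documents are deleted from Semantria, so persist each batch as soon as it arrives. The key, secret, and config_id are placeholders:

```python
# Retrieve one batch of processed documents (up to 100 by default).
import semantria

session = semantria.Session("your-key", "your-secret")
for doc in session.getProcessedDocuments(config_id="your-config-id"):
    print(doc.get("id"), doc.get("status"))  # replace with real persistence
```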



{"__v":0,"_id":"577e4bf24159cd1900d5d2f7","api":{"examples":{"codes":[{"language":"text","code":"DELETE https://api.semantria.com/document/d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff.json","name":""}]},"results":{"codes":[{"status":202,"language":"json","code":"HTTP/1.0 202 Request accepted and served.","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55aec5e3826d210d00041d4d","ref":"","required":true,"desc":"the document id","default":"","type":"string","name":"document_id","in":"path"}],"url":"/document/:document_id.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2b1","createdAt":"2015-07-21T22:19:20.434Z","editedParams":true,"editedParams2":true,"excerpt":"Canceling: deleting a queued document if Semantria has not processed it yet.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":38,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"cancelling-documents","sync_unique":"","title":"Canceling documents","type":"delete","updates":[],"user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

DELETE Canceling documents

Canceling: deleting a queued document if Semantria has not processed it yet.

Path Params

document_id (string, required): the document ID

Definition

DELETE https://api.semantria.com/document/:document_id.[json|xml]
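A sketch of the shape of a cancel call only: real requests must carry a timestamp and nonce (the SDKs handle that signing), so a bare DELETE like this would be rejected as-is.

```python
# Shape of the cancel endpoint; authentication signing is deliberately omitted.
import requests

doc_id = "d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff"  # placeholder id
resp = requests.delete(f"https://api.semantria.com/document/{doc_id}.json")
print(resp.status_code)  # 202 when a still-queued document is removed
```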



{"__v":0,"_id":"577e4bf34159cd1900d5d2f8","api":{"examples":{"codes":[{"name":"","code":"GET https://api.semantria.com/document/processed.json?job_id=1","language":"http"},{"code":"import semantria\nsession = semantria.Session(key, secret)\nsession.getProcessedDocumentsByJobId(\n  job_id = \"id\"\n)","language":"python"}]},"results":{"codes":[{"status":202,"language":"json","code":"HTTP/1.0 202 Request accepted and served.\n[\n   {\n      \"id\" : \"d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n      \"config_id\" : \"cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n      \"tag\" : \"Any text (up to 50 characters) used like a marker.\",\n      \"status\" : \"PROCESSED\"\n      //Accompanying output as described in the \"Request Document\" section\n   },\n   {\n      \"id\" : \"d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n      \"config_id\" : \"cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n      \"tag\" : \"Any text (up to 50 characters) used like a marker.\",\n      \"status\" : \"PROCESSED\"\n      //Accompanying output as described in the \"Request Document\" section\n   }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""},{"status":202,"language":"xml","code":"HTTP/1.0 202 Request accepted and served.\n<documents>\n  <document>\n    <id>d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</id>\n    <config_id>cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</config_id>\n    <tag>Any text (up to 50 characters) used like a marker.</tag>\n    <status>PROCESSED</status>\n    <!-- Accompanying output as described in the “Request Document” section -->\n  </document>\n  <document>\n    <id>d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</id>\n    <config_id>cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</config_id>\n    <tag>Any text (up to 50 characters) used like a marker.</tag>\n    <status>PROCESSED</status>\n    <!-- Accompanying output as described in the “Request Document” section -->\n  </document>\n</documents>"}]},"settings":"","auth":"required","params":[{"_id":"55aebc31555b900d0036d161","ref":"","required":false,"desc":"The job_id","default":"","type":"string","name":"job_id","in":"query"}],"url":"/document/processed.[json | xml]"},"body":"[block:callout]\n{\n  \"type\": \"info\",\n  \"body\": \"By default, the server responds to retrieval requests with 100 documents per batch. To increase this limit, please contact us.\",\n  \"title\": \"Batch Size Limit\"\n}\n[/block]\n\n[block:callout]\n{\n  \"type\": \"warning\",\n  \"body\": \"Once a document has been retrieved, it will be removed from the Semantria systems.\"\n}\n[/block]","category":"577e4bf24159cd1900d5d2b1","createdAt":"2015-09-16T19:54:53.317Z","editedParams":true,"editedParams2":true,"excerpt":"If you submitted via job_id, you must also retrieve by that job_id. You cannot use a config_id when retrieving by job_id, although you can use a config_id when submitting with a job_id.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":39,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"retrieving-documents-by-job_id","sync_unique":"","title":"Retrieving Documents by job_id","type":"get","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

GET Retrieving Documents by job_id

If you submitted via job_id, you must also retrieve by that job_id. You cannot use a config_id when retrieving by job_id, although you can use a config_id when submitting with a job_id.

Query Params

job_id (string): the job_id
[block:callout] { "type": "info", "body": "By default, the server responds to retrieval requests with 100 documents per batch. To increase this limit, please contact us.", "title": "Batch Size Limit" } [/block] [block:callout] { "type": "warning", "body": "Once a document has been retrieved, it will be removed from the Semantria systems." } [/block]

Definition

GET https://api.semantria.com/document/processed.[json|xml]
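A retrieval example based on this page's Python sample; the job_id must match the one used at submission, and the key and secret are placeholders:

```python
# Retrieve processed documents for a specific job stream.
import semantria

session = semantria.Session("your-key", "your-secret")
for doc in session.getProcessedDocumentsByJobId(job_id="historical"):
    print(doc.get("id"), doc.get("status"))
```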



{"category":"577e4bf24159cd1900d5d2b1","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d2f9","createdAt":"2016-03-21T13:47:32.303Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"examples":{"codes":[{"name":"JSON - document","language":"json","code":"{\"id\":\"1234\", \n \"text\": \"Some sample text\", \n \"metadata\": {\"source\": \"twitter\", \"datetime\": \"2016-01-08T17:21:01\", ...},\n }"},{"name":"JSON - collection","language":"json","code":"{\n \"id\": \"x-collection\",\n \"metadata\": [\"abc\", 123],\n \"tag\": \"my-tag\",\n \"documents\": [\"Some text\", \"Some other text\"]\n}"}]},"results":{"codes":[{"status":202,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":"/document.json"},"isReference":false,"order":40,"body":"The metadata field is schema less and can contain any valid JSON you wish to send. You can attach metadata to individual documents and to collections as well.","excerpt":"In addition to the required fields, you can also submit metadata about your content. Metadata submissions are allowed only when using the JSON format of the endpoint. No processing of the metadata elements is done by Semantria and the metadata fields will be returned to you with the document when you retrieve it.","slug":"sending-metadata-with-your-documents","type":"post","title":"Sending metadata with your documents","__v":0,"childrenPages":[]}

POST Sending metadata with your documents

In addition to the required fields, you can also submit metadata about your content. Metadata submissions are allowed only when using the JSON format of the endpoint. No processing of the metadata elements is done by Semantria and the metadata fields will be returned to you with the document when you retrieve it.

The metadata field is schema-less and can contain any valid JSON you wish to send. You can attach metadata to individual documents as well as to collections.

Definition

https://api.semantria.com/document.json

Examples
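
A hedged sketch of attaching metadata via the Python SDK; queueDocument is an assumed method name (by analogy with the queueCollection call documented later), and the credentials are placeholders:

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

# Metadata is schema-less: any valid JSON object (a Python dict here).
doc = {
    "id": "1234",
    "text": "Some sample text",
    "metadata": {"source": "twitter", "datetime": "2016-01-08T17:21:01"},
}

# Assumed method name; the metadata comes back untouched on retrieval.
session.queueDocument(doc)
```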


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d317","api":{"examples":{"codes":[{"language":"json","code":"POST https://api.semantria.com/collection.json?config_id=cd2e7341-a3c2-4fb4-9d3a-\n  779e8b0a5eff\n{\n  \"id\" : \"6F9619FF8B86D011B42D00CF4FC964FF\",\n  \"tag\" : \"Any text (up to 50 characters) used like a marker.\",\n  \"documents\" : [\n    \"The first chunk of text for processing\",\n    \"Another chunk of text for processing\",\n    \"Third chunk of text for processing\"\n  ]\n}","name":""},{"language":"xml","code":"POST https://api.semantria.com/collection.xml?config_id=cd2e7341-a3c2-4fb4-9d3a-\n          779e8b0a5eff\n\n<collection>\n  <id>6F9619FF8B86D011B42D00CF4FC964FF</id>\n  <tag>Any text (up to 50 characters) used like a marker.</tag>\n  <documents>\n    <document>The first chunk of text for processing</document>\n    <document>Another chunk of text for processing</document>\n    <document>Third chunk of text for processing</document>\n  </documents>\n</collection>"},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.queueCollection(\n  config_id = \"id\",\n  {\n  \t\"id\" : \"6F9619FF8B86D011B42D00CF4FC964FF\",\n  \t\"tag\" : \"Any text (up to 50 characters) used like a marker.\",\n  \t\"documents\" : [\n    \t\"The first chunk of text for processing\",\n    \t\"Another chunk of text for processing\",\n    \t\"Third chunk of text for processing\"\n  \t]\n\t}\n)"}]},"results":{"codes":[{"status":202,"language":"json","code":"HTTP/1.0 202 Request accepted and queued for processing.","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55f9ccd6d6f4370d001d9957","ref":"","required":false,"desc":"(Optional) ID of config to use","default":"","type":"string","name":"config_id","in":"body"}],"url":"https:/api.semantria.com/collection.[json|xml]"},"body":"","category":"577e4bf24159cd1900d5d2b2","createdAt":"2015-07-21T22:55:21.532Z","editedParams":true,"editedParams2":true,"excerpt":"This method submits an array of documents to be analyzed in relation to each other and returns one output. Discovery analysis will contain a summary of sentiment, named entity extraction, themes, and categorization for all the documents in the collection.\n\nA collection consists of an array of elements: (optional) ID, (optional) tag and an array of pieces of text.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":41,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"sending-a-collection-of-documents","sync_unique":"","title":"Sending a collection of documents","type":"post","updates":[],"user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

POST Sending a collection of documents

This method submits an array of documents to be analyzed in relation to each other and returns one output. The discovery analysis will contain a summary of sentiment, named entity extraction, themes, and categorization for all the documents in the collection. A collection consists of an optional ID, an optional tag, and an array of pieces of text.

Body JSON

config_id:
string
(Optional) ID of config to use

Definition

https://api.semantria.com/collection.[json | xml]

Examples
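
The SDK example from this page's source, reordered so the keyword argument comes last (as Python requires); key and secret are placeholders:

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

# Optional id, optional tag, and the texts analyzed in relation to each other.
collection = {
    "id": "6F9619FF8B86D011B42D00CF4FC964FF",
    "tag": "Any text (up to 50 characters) used like a marker.",
    "documents": [
        "The first chunk of text for processing",
        "Another chunk of text for processing",
        "Third chunk of text for processing",
    ],
}

session.queueCollection(collection, config_id="id")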


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d318","api":{"examples":{"codes":[{"language":"json","code":"POST https://api.semantria.com/collection.json?config_id=cd2e7341-a3c2-4fb4-9d3a-\n  779e8b0a5eff&job_id=1\n{\n  \"id\" : \"6F9619FF8B86D011B42D00CF4FC964FF\",\n  \"tag\" : \"Any text (up to 50 characters) used like a marker.\",\n  \"documents\" : [\n    \"The first chunk of text for processing\",\n    \"Another chunk of text for processing\",\n    \"Third chunk of text for processing\"\n  ]\n}","name":""},{"language":"xml","code":"POST https://api.semantria.com/collection.xml?config_id=cd2e7341-a3c2-4fb4-9d3a-\n          779e8b0a5eff\n\n<collection>\n  <id>6F9619FF8B86D011B42D00CF4FC964FF</id>\n  <tag>Any text (up to 50 characters) used like a marker.</tag>\n  <documents>\n    <document>The first chunk of text for processing</document>\n    <document>Another chunk of text for processing</document>\n    <document>Third chunk of text for processing</document>\n  </documents>\n</collection>"},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.queueCollection(\n  config_id = \"id\",\n  job_id = \"id\"\n  {\n  \t\"id\" : \"6F9619FF8B86D011B42D00CF4FC964FF\",\n  \t\"tag\" : \"Any text (up to 50 characters) used like a marker.\",\n  \t\"documents\" : [\n    \t\"The first chunk of text for processing\",\n    \t\"Another chunk of text for processing\",\n    \t\"Third chunk of text for processing\"\n  \t]\n\t}\n)"}]},"results":{"codes":[{"status":202,"language":"json","code":"HTTP/1.0 202 Request accepted and queued for processing.","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55f9ccd6d6f4370d001d9957","ref":"","required":false,"desc":"(Optional) ID of config to use","default":"","type":"string","name":"config_id","in":"body"},{"_id":"55f9cf0d30f2600d00f934b9","ref":"","required":false,"desc":"(Optional) ID of job.","default":"","type":"string","name":"job_id","in":"body"}],"url":"https:/api.semantria.com/collection.[json|xml]"},"body":"","category":"577e4bf24159cd1900d5d2b2","createdAt":"2015-09-16T20:20:29.541Z","editedParams":true,"editedParams2":true,"excerpt":"This method submits an array of documents to be analyzed in relation to each other and returns one output. Discovery analysis will contain a summary of sentiment, named entity extraction, themes, and categorization for all the documents in the collection.\n\nA collection consists of an array of elements: (optional) ID, (optional) tag and an array of pieces of text.\n\nYou can use a job_id to separate specific environment (such as dev vs QA). Collections submitted with a job_id must be retrieved via that same job_id.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":42,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"sending-a-collection-by-job_id","sync_unique":"","title":"Sending a collection by job_id","type":"post","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

POST Sending a collection by job_id

This method submits an array of documents to be analyzed in relation to each other and returns one output. The discovery analysis will contain a summary of sentiment, named entity extraction, themes, and categorization for all the documents in the collection. A collection consists of an optional ID, an optional tag, and an array of pieces of text. You can use a job_id to separate specific environments (such as dev vs. QA). Collections submitted with a job_id must be retrieved via that same job_id.

Body JSON

config_id:
string
(Optional) ID of config to use
job_id:
string
(Optional) ID of job.

Definition

https://api.semantria.com/collection.[json | xml]

Examples
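
The same submission carrying a job_id as well; this sketch corrects the argument ordering of the raw SDK example, and the credentials are placeholders:

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

collection = {
    "id": "6F9619FF8B86D011B42D00CF4FC964FF",
    "tag": "Any text (up to 50 characters) used like a marker.",
    "documents": [
        "The first chunk of text for processing",
        "Another chunk of text for processing",
    ],
}

# Collections submitted with a job_id must be retrieved via that same job_id.
session.queueCollection(collection, config_id="id", job_id="1")
```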


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d319","api":{"examples":{"codes":[{"language":"text","code":"GET https://api.semantria.com/collection/processed.json?config_id=cd2e7341-a3c2-\n  4fb4-9d3a-779e8b0a5eff","name":"config_id"},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.getProcessedCollections( config_id = \"id\" )"}]},"results":{"codes":[{"status":202,"language":"json","code":"HTTP/1.0 202 Request accepted and served.\n[\n   {\n      \"id\" : \"d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n      \"config_id\" : \"cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n      \"tag\" : \"Any text (up to 50 characters) used like a marker.\",\n      \"status\" : \"PROCESSED\"\n      //Accompanying output as described in the \"Request Collection\" section\n   },\n   {\n      \"id\" : \"s8k5441-ar62-4f24-95wt-479e845d6csf\",\n      \"config_id\" : \"cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n      \"tag\" : \"Any text (up to 50 characters) used like a marker.\",\n      \"status\" : \"PROCESSED\"\n      //Accompanying output as described in the \"Request Collection\" section\n   }\n] ","name":""},{"status":400,"language":"json","code":"{}","name":""},{"status":202,"language":"xml","code":"HTTP/1.0 202 Request accepted and served.\n<collections>\n  <collection>\n    <id>d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</id>\n    <config_id>cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</config_id>\n    <tag>Any text (up to 50 characters) used like a marker.</tag>\n    <!- Accompanying output as described in the “Request Collection” section ->\n  </collection>\n\n  <collection>\n    <id> s8k5441-ar62-4f24-95wt-479e845d6csf </id>\n    <config_id>cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</config_id>\n    <tag>Any text (up to 50 characters) used like a marker.</tag>\n    <!- Accompanying output as described in the “Request Collection” section ->\n  </collection>\n</collections>"}]},"settings":"","auth":"required","params":[{"_id":"55aed248555b900d0036d1a9","ref":"","required":false,"desc":"return processed documents from a particular config id. If the config_id is not provided, the API uses the primary configuration id by default","default":"","type":"string","name":"config_id","in":"query"},{"_id":"55aed248555b900d0036d1a7","ref":"","required":false,"desc":"return only the document with this ID","default":"","type":"string","name":"document_id","in":"query"}],"url":"https://api.semantria.com/collection/processed.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2b2","createdAt":"2015-07-21T23:14:16.398Z","editedParams":true,"editedParams2":true,"excerpt":"Retrieving: Returning any and all processed collections","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":43,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"retrieving-processed-discovery-analyses","sync_unique":"","title":"Retrieving processed discovery analyses","type":"get","updates":[],"user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

GET Retrieving processed discovery analyses

Retrieving: returning any and all processed collections.

Query Params

config_id:
string
Return processed documents from a particular config_id. If no config_id is provided, the API uses the primary configuration by default
document_id:
string
Return only the document with this ID

Definition

https://api.semantria.com/collection/processed.[json | xml]

Examples
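
A minimal sketch of the SDK call shown in the page source (placeholder credentials):

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

# Omit config_id to fall back to the primary configuration.
collections = session.getProcessedCollections(config_id="id")
```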


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d31a","api":{"examples":{"codes":[{"language":"text","code":"GET https://api.semantria.com/collection/processed.json?config_id=cd2e7341-a3c2-\n  4fb4-9d3a-779e8b0a5eff","name":"config_id"},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.getProcessedCollections( job_id = \"id\" )"}]},"results":{"codes":[{"status":202,"language":"json","code":"HTTP/1.0 202 Request accepted and served.\n[\n   {\n      \"id\" : \"d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n      \"config_id\" : \"cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n      \"tag\" : \"Any text (up to 50 characters) used like a marker.\",\n      \"status\" : \"PROCESSED\"\n      //Accompanying output as described in the \"Request Collection\" section\n   },\n   {\n      \"id\" : \"s8k5441-ar62-4f24-95wt-479e845d6csf\",\n      \"config_id\" : \"cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n      \"tag\" : \"Any text (up to 50 characters) used like a marker.\",\n      \"status\" : \"PROCESSED\"\n      //Accompanying output as described in the \"Request Collection\" section\n   }\n] ","name":""},{"status":400,"language":"json","code":"{}","name":""},{"status":202,"language":"xml","code":"HTTP/1.0 202 Request accepted and served.\n<collections>\n  <collection>\n    <id>d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</id>\n    <config_id>cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</config_id>\n    <tag>Any text (up to 50 characters) used like a marker.</tag>\n    <!- Accompanying output as described in the “Request Collection” section ->\n  </collection>\n\n  <collection>\n    <id> s8k5441-ar62-4f24-95wt-479e845d6csf </id>\n    <config_id>cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</config_id>\n    <tag>Any text (up to 50 characters) used like a marker.</tag>\n    <!- Accompanying output as described in the “Request Collection” section ->\n  </collection>\n</collections>"}]},"settings":"","auth":"required","params":[{"_id":"55aed248555b900d0036d1a9","ref":"","required":false,"desc":"return processed documents from a particular job id.","default":"","type":"string","name":"job_id","in":"query"}],"url":"https://api.semantria.com/collection/processed.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2b2","createdAt":"2015-09-16T20:17:47.123Z","editedParams":true,"editedParams2":true,"excerpt":"If you submitted by job_id you must also retrieve by job_id.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":44,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"retrieving-processed-discovery-by-job_id","sync_unique":"","title":"Retrieving processed discovery by job_id","type":"get","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

GET Retrieving processed discovery by job_id

If you submitted by job_id, you must also retrieve by that job_id.

Query Params

job_id:
string
Return processed documents from a particular job_id

Definition

https://api.semantria.com/collection/processed.[json | xml]

Examples
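
The job_id variant of the same SDK call, per the page source (placeholder credentials):

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

# Must match the job_id the collections were submitted with.
collections = session.getProcessedCollections(job_id="id")
```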


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d31b","api":{"examples":{"codes":[{"language":"text","code":"GET https://api.semantria.com/collection/d2e7341-a3c2-4fb4-9d3a-\n  779e8b0a5eff.json","name":""}]},"results":{"codes":[{"status":200,"language":"json","code":"HTTP/1.0 200 Request accepted and served.\n\n{\n   “id” : “d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff”,\n   “config_id” : “cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff”,\n   “tag” : “Any text (up to 50 characters) used like a marker.”,\n   “status” : “PROCESSED”,\n   “facets” : [\n      {\n         “label” : “Something”,\n         “count” : 10,\n         “negative_count” : 2,\n         “positive_count” : 1,\n         “neutral_count” : 7,\n         “attributes” : [\n            {\n               “label” : “Attribute”,\n               “count” : 5\n               “mentions” : [\n                  {\n                     “label” : “something”,\n                     “is_negated” : true,\n                     “negating_phrase” : “negator”,\n                     ]\n                  }\n               ]\n            }\n         ],\n         “mentions” : [\n            {\n               “label” : “something”,\n               “is_negated” : true,\n                “negating_phrase” : “negator”,\n            }\n         ]\n      }\n   ],\n   “themes” : [\n      {\n         “title” : “republican moderates”,\n         “phrases_count” : 5,\n         “themes_count” : 9,\n         “sentiment_score” : 0.37,\n         “sentiment_polarity” : “positive”,\n         “mentions” : [\n            {\n               “label” : “republican moderates”,\n               “is_negated” : true,\n                “negating_phrase” : “negator”,\n                “locations” : [\n                   {\n                      “index” : 17,\n                       “offset” : 987,\n                       “length” : 9\n                   }\n                ]\n            }\n         ]\n      }\n   ],\n   “entities” : [\n      {\n         “title” : “WASHINGTON”,\n         “label” : “The capital of the United States of America”,\n         “type” : “named”,\n         “entity_type” : “Place”,\n         “count” : 5,\n         “negative_count” : 2,\n         “positive_count” : 1,\n         “neutral_count” : 2,\n         “mentions” : [\n            {\n               “label” : “WASHINGTON”,\n               “is_negated” : true,\n                “negating_phrase” : “negator”,\n                “locations” : [\n                   {\n                      “index” : 17,\n                       “offset” : 987,\n                       “length” : 9\n                   }\n                ]\n            }\n         ]\n      }\n   ],\n   “topics” : [\n      {\n         “title” : “Something”,\n         “type” : “concept”,\n         “hitcount” : 0,\n         “sentiment_score” : 0.6133076,\n         “sentiment_polarity” : “positive”\n      }\n   ]\n}}","name":""},{"status":400,"language":"json","code":"{}","name":""},{"status":200,"language":"xml","code":"HTTP/1.0 200 Request accepted and served.\n<collection>\n   <config_id>cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</config_id>\n   <id>d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff</id>\n   <tag>Any text (up to 50 characters) used like a marker.</tag>\n   <status>PROCESSED</status>\n   <facets>\n      <facet>\n         <label>Something</label>\n         <count>10</count>\n         <negative_count>2</negative_count>\n         <positive_count>1</positive_count>\n         <neutral_count>7</neutral_count>\n         <attributes>\n            <attribute>\n               
<label>Attribute</label>\n               <count>5</count>\n               <mentions>\n                  <mention>\n                     <label>something</label>\n                     <is_negated>true</is_negated>\n                     <negating_phrase>some</negating_phrase>\n                  </mention>\n               </mentions>\n            </attribute>\n         </attributes>\n         <mentions>\n            <mention>\n               <label>something</label>\n               <is_negated>true</is_negated>\n               <negating_phrase>some</negating_phrase>\n            </mention>\n         </mentions>\n      </facet>\n   </facets>\n   <themes>\n      <theme>\n         <title>republican moderates</title>\n         <phrases_count>5</phrases_count>\n         <themes_count>9</themes_count>\n         <sentiment_score>0.37</sentiment_score>\n         <sentiment_polarity>positive</sentiment_polarity>\n         <mentions>\n            <mention>\n               <label>something</label>\n               <is_negated>true</is_negated>\n               <negating_phrase>some</negating_phrase>\n               <locations>\n                  <location>\n                     <offset>987</offset>\n                     <length>9</length>\n                     <index>17</index>\n                  </location>\n               </locations>\n            </mention>\n         </mentions>\n      </theme>\n   </themes>\n   <entities>\n      <entity>\n         <title>WASHINGTON</title>\n         <label>The capital of the United States of America.</label>\n         <type>named</type>\n         <entity_type>Place</entity_type>\n         <count>5</count>\n         <negative_count>2</negative_count>\n         <positive_count>1</positive_count>\n         <neutral_count>2</neutral_count>\n         <mentions>\n            <mention>\n               <label>something</label>\n               <is_negated>true</is_negated>\n               <negating_phrase>some</negating_phrase>\n               <locations>\n                  <location>\n                     <offset>987</offset>\n                     <length>9</length>\n                     <index>17</index>\n                  </location>\n               </locations>\n            </mention>\n         </mentions>\n      </entity>\n   </entities>\n   <topics>\n      <topic>\n         <title>Something</title>\n         <hitcount>0</hitcount>\n         <sentiment_score>0.6133076</sentiment_score>\n         <sentiment_polarity>positive</sentiment_polarity>\n         <type>concept</type>\n      </topic>\n   </topics>\n</collection>"}]},"settings":"","auth":"required","params":[{"_id":"55aed0c4826d210d00041d68","ref":"","required":true,"desc":"the collection's id","default":"","type":"string","name":"collection_id","in":"path"}],"url":"https://api.semantria.com/collection/:collection_id.[json|xml]"},"body":"","category":"577e4bf24159cd1900d5d2b2","createdAt":"2015-07-21T23:07:48.616Z","editedParams":true,"editedParams2":true,"excerpt":"Requesting: Asking the status (or processed results) of a specific collection","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":45,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"requesting-discovery-analysis","sync_unique":"","title":"Requesting specific discovery analysis","type":"get","updates":[],"user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

GET Requesting specific discovery analysis

Requesting: asking for the status (or processed results) of a specific collection.

Path Params

collection_id:
required
string
The collection's ID

Definition

https://api.semantria.com/collection/:collection_id.[json | xml]

Examples
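
This page documents only the raw HTTP form; a Python equivalent might look like the sketch below, where getCollection is an assumed helper name rather than one confirmed by this document:

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

# Assumed helper name; the grounded form is the raw HTTP call:
#   GET https://api.semantria.com/collection/<collection_id>.json
result = session.getCollection("d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff")
```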


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d31c","api":{"examples":{"codes":[{"language":"text","code":"DELETE https://api.semantria.com/collection/d2e7341-a3c2-4fb4-9d3a-\n779e8b0a5eff.json","name":""}]},"results":{"codes":[{"status":202,"language":"text","code":"HTTP/1.0 202 Request accepted and served.","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55afe859d7624e3700e4fae9","ref":"","required":true,"desc":"","default":"","type":"string","name":"collection_id","in":"path"}],"url":"https://api.semantria.com/collection/:collection_id.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2b2","createdAt":"2015-07-22T19:00:41.823Z","editedParams":true,"editedParams2":true,"excerpt":"Canceling: deleting a queued document if Semantria has not processed it yet.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":46,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"canceling-discovery-analyses","sync_unique":"","title":"Canceling discovery analyses","type":"delete","updates":[],"user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

DELETE Canceling discovery analyses

Canceling: deleting a queued collection if Semantria has not yet processed it.

Path Params

collection_id:
required
string
The ID of the collection to cancel

Definition

https://api.semantria.com/collection/:collection_id.[json | xml]

Examples
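
Only the raw HTTP DELETE is shown in the page source; the Python sketch below uses cancelCollection as a hypothetical helper name, so treat it as an assumption:

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

# Hypothetical helper name; the grounded form is the raw HTTP call:
#   DELETE https://api.semantria.com/collection/<collection_id>.json
session.cancelCollection("d2e7341-a3c2-4fb4-9d3a-779e8b0a5eff")
```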


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d2ff","api":{"examples":{"codes":[{"name":"","code":"POST https://api.semantria.com/configurations.json\n[\n   \"cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff\",\n   \"b1df666c-df5e-44c3-86f6-d1f331024f19\"\n]","language":"json"},{"code":"POST https://api.semantria.com/configurations.xml\n<configurations>\n   <configuration>\n      <config_id> </config_id>\n      <name>New test configuration</name>\n      <is_primary>true</is_primary>\n      <auto_response>false</auto_response>\n      <language>English</language>\n      <chars_threshold>80</chars_threshold>\n\t  <entities_threshold>0</entities_threshold>\n      <one_sentence>false</one_sentence>\n      <process_html>false</process_html>\n      <callback>https://anyapi.anydomain.com/processed.json</callback>\n       <document>\n         <pos_types>Noun,Verb,Adjective</pos_types>\n         <phrases_limit>10</phrases_limit>\n         <possible_phrases_limit>10</possible_phrases_limit>\n         <auto_categories_limit>5</auto_categories_limit>\n         <concept_topics_limit>5</concept_topics_limit>\n         <query_topics_limit>5</query_topics_limit>\n         <named_entities_limit>5</named_entities_limit>\n         <user_entities_limit>5</user_entities_limit>\n         <entity_themes_limit>5</entity_themes_limit>\n         <named_mentions_limit>0</named_mentions_limit>\n         <user_mentions_limit>0</user_mentions_limit>\n         <named_relations_limit>0</named_relations_limit>\n         <user_relations_limit>0</user_relations_limit>\n         <named_opinions_limit>0</named_opinions_limit>\n         <user_opinions_limit>0</user_opinions_limit>\n         <themes_limit>0</themes_limit>\n         <theme_mentions_limit>0</theme_mentions_limit>\n         <summary_limit>0</summary_limit>\n         <detect_language>true</detect_language>\n      </document>\n      <collection>\n         <facets_limit>15</facets_limit>\n         <facet_atts_limit>5</facet_atts_limit>\n         <facet_mentions_limit>0</facet_mentions_limit>\n         <attribute_mentions_limit>0</attribute_mentions_limit>\n         <concept_topics_limit>5</concept_topics_limit>\n         <query_topics_limit>5</query_topics_limit>\n         <named_entities_limit>5</named_entities_limit>\n         <named_mentions_limit>0</named_mentions_limit>\n         <themes_limit>0</themes_limit>\n         <theme_mentions_limit>0</theme_mentions_limit>\n\t\t <user_entities_limit>0</user_entities_limit>\n\t\t <user_mentions_limit>0</user_mentions_limit>\n      </collection>\n   </configuration>\n</configurations>","language":"xml"}]},"results":{"codes":[{"status":200,"language":"json","code":"HTTP/1.0 200 Request accepted and served.","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55e9a8250c9b420d0042b255","ref":"","required":false,"desc":"List of configuration IDs to delete","default":"","type":"array_string","name":"body","in":"body"}],"url":"/configurations.[json | xml]"},"body":"The configuration endpoint allows you to manage your Semantria configurations. For all operations except DELETE, if you do not send in a configuration ID, the action will apply to your PRIMARY configuration. You cannot delete your PRIMARY configuration.\n\nEach configuration has a language associated with it. 
This language cannot be changed once the configuration is established.\n\nEach of the NLP outputs (such as themes, entities and so on) can be turned off by setting the corresponding output limit value to 0.","category":"577e4bf24159cd1900d5d2b3","createdAt":"2015-07-07T21:28:32.436Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":47,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"configuration","sync_unique":"","title":"Configuration Basics","type":"basic","updates":[],"user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

Configuration Basics


The configuration endpoint allows you to manage your Semantria configurations. For all operations except DELETE, if you do not send in a configuration ID, the action will apply to your PRIMARY configuration. You cannot delete your PRIMARY configuration.

Each configuration has a language associated with it. This language cannot be changed once the configuration is established.

Each of the NLP outputs (such as themes, entities and so on) can be turned off by setting the corresponding output limit value to 0.
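
For example, here is a minimal sketch of turning one output off via the update call documented later on this page's siblings; the themes_limit field name follows the XML example in the page source, and the credentials and config_id are placeholders:

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

# Setting an output's limit to 0 turns that output off.
session.updateConfigurations({
    "config_id": "id",
    "document": {"themes_limit": 0},  # stop returning document themes
})
```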
{"category":"577e4bf24159cd1900d5d2b3","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d300","createdAt":"2015-09-04T18:20:27.685Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"examples":{"codes":[{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.getConfigurations()","name":""},{"language":"http","code":"https://api.semantria.com/configurations.json"},{"language":"csharp","code":" using (Session session = Session.CreateSession(consumerKey, consumerSecret, serializer))\n List<dynamic> configs = session.GetConfigurations()"}]},"results":{"codes":[{"status":200,"language":"json","code":"[\n   {\n      \"name\": \"New test configuration\",\n\t \t\t\"language\": \"English\",\n\t  \t\"config_id\" : \"\",\n      \"is_primary\" : true,\n\t  \t\"document\": {\n        \"intentions\": false,\n        \"concept_topics\": true,\n        \"query_topics\": true,\n        \"detect_language\": true,\n        \"themes\": false,\n        \"named_entities\": true,\n        \"sentiment_phrases\": true,\n        \"user_entities\": true,\n        \"pos_types\": \"\",\n        \"summary_size\": 20,\n        \"relations\" : true,\n        \"mentions\": false,\n        \"opinions\" : true,\n        \"auto_categories\": true,\n        \"model_sentiment\" : false\n\t  },\n     \"auto_response\": false,\n     \"is_primary\": false,\n     \"concept_topics_threshold\": 0.45,\n     \"entities_threshold\": 0,\n     \"collection\": {\n      \"concept_topics\": true,\n      \"query_topics\": true,\n      \"named_entities\": true,\n      \"user_entities\": true,\n      \"mentions\" : false,\n      \"attributes\": true,\n      \"facets\": true,\n      \"themes\": true,\n\t  },\n     \"process_html\": false,\n     \"alphanumeric_threshold\": 30,\n     \"one_sentence_mode\": false\n    }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":"/configurations.[json | xml]"},"isReference":false,"order":48,"body":"","excerpt":"The results listing here shows every settable option in a configuration. You do not have to submit all of these values to modify specific values of a configuration.","slug":"listing-existing-configurations","type":"get","title":"Listing existing configurations","__v":0,"childrenPages":[]}

GET Listing existing configurations

The results listing here shows every settable option in a configuration. You do not have to submit all of these values to modify specific values of a configuration.

Definition

https://api.semantria.com/configurations.[json | xml]

Examples
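
The SDK call from the page source, as a minimal sketch (placeholder credentials):

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

# Returns every configuration on the account with all settable options.
configs = session.getConfigurations()
```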


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d301","api":{"examples":{"codes":[{"language":"json","code":"POST https://api.semantria.com/configurations.json\n[\n   {\n      \"name\" : \"New test configuration\",\n      \"is_primary\" : true,\n      \"auto_response\" : false,\n      \"language\" : \"English\",\n      \"alphanumeric_threshold\" : 80,\n      \"categories_threshold\" : 0.45,\n\t  \t\"entities_threshold\" : 0,\n      \"one_sentence_mode\" : false,\n      \"process_html\" : false,\n      \"callback\" : \"https://anyapi.anydomain.com/processed.json\",\n      \"document\" : {\n         \"intentions\" : false,\n         \"pos_types\" : \"Noun,Verb,Adjective\",\n         \"sentiment_phrases\" : true,\n         \"auto_categories\" : true,\n         \"concept_topics\" : true,\n         \"query_topics\" : true,\n         \"named_entities\" : true,\n         \"user_entities\" : true,\n         \"mentions\" : false,\n         \"relations\" : false,\n         \"opinions\" : false,\n         \"themes\" : false,\n         \"summary_size\" : 0,\n         \"detect_language\" : true\n      },\n      \"collection\" : {\n         \"facets\" : true,\n         \"attributes\" : true,\n         \"mentions\" : false,\n         \"concept_topics\" : true,\n         \"query_topics\" : true,\n         \"named_entities\" : true,\n         \"themes\" : false,\n\t\t\t\t \"user_entitities\" : true,\n      }\n   }\n]","name":""},{"language":"xml","code":"POST https://api.semantria.com/configurations.xml\n<configurations>\n   <configuration>\n      <config_id> </config_id>\n      <name>New test configuration</name>\n      <is_primary>true</is_primary>\n      <auto_response>false</auto_response>\n      <language>English</language>\n      <chars_threshold>80</chars_threshold>\n\t  <entities_threshold>0</entities_threshold>\n      <one_sentence>false</one_sentence>\n      <process_html>false</process_html>\n      <callback>https://anyapi.anydomain.com/processed.json</callback>\n       <document>\n         <pos_types>Noun,Verb,Adjective</pos_types>\n         <phrases_limit>10</phrases_limit>\n         <possible_phrases_limit>10</possible_phrases_limit>\n         <auto_categories_limit>5</auto_categories_limit>\n         <concept_topics_limit>5</concept_topics_limit>\n         <query_topics_limit>5</query_topics_limit>\n         <named_entities_limit>5</named_entities_limit>\n         <user_entities_limit>5</user_entities_limit>\n         <entity_themes_limit>5</entity_themes_limit>\n         <named_mentions_limit>0</named_mentions_limit>\n         <user_mentions_limit>0</user_mentions_limit>\n         <named_relations_limit>0</named_relations_limit>\n         <user_relations_limit>0</user_relations_limit>\n         <named_opinions_limit>0</named_opinions_limit>\n         <user_opinions_limit>0</user_opinions_limit>\n         <themes_limit>0</themes_limit>\n         <theme_mentions_limit>0</theme_mentions_limit>\n         <summary_limit>0</summary_limit>\n         <detect_language>true</detect_language>\n      </document>\n      <collection>\n         <facets_limit>15</facets_limit>\n         <facet_atts_limit>5</facet_atts_limit>\n         <facet_mentions_limit>0</facet_mentions_limit>\n         <attribute_mentions_limit>0</attribute_mentions_limit>\n         <concept_topics_limit>5</concept_topics_limit>\n         <query_topics_limit>5</query_topics_limit>\n         <named_entities_limit>5</named_entities_limit>\n         <named_mentions_limit>0</named_mentions_limit>\n         <themes_limit>0</themes_limit>\n         
<theme_mentions_limit>0</theme_mentions_limit>\n\t\t <user_entities_limit>0</user_entities_limit>\n\t\t <user_mentions_limit>0</user_mentions_limit>\n      </collection>\n   </configuration>\n</configurations>"},{"language":"python","code":"import semantria\nsession = semantria.Session(key,secret)\nsession.addConfigurations({ \n    \"name\" : \"myConfig\", \n    \"auto_response\" : \"false\" , \n    \"language\" : \"French\" , \n    \"document\" : \n    { \n      \"detect_language\" : \"true\" \n    }, \n    \"collection\" : \n    { \n      \"facets_limit\" : 5 \n    } \n  }\n                         )"}]},"results":{"codes":[{"name":"","code":"HTTP/1.0 202 Request accepted and served.","language":"json","status":202},{"name":"Empty name field","code":"{ 'status' : 400, 'message': 'Configurations name is empty' }","language":"json","status":400},{"name":"Language not purchased","code":"{'status' : 403, 'message' : The English language isn\\'t supported by your license type. Please contact Semantria support (support@semantria.com) for details.' }","language":"json","status":403},{"name":"Feature not purchased","status":403,"language":"json","code":"{ 'status' : 403, 'message' : 'The facets (Discovery mode) feature isn\\'t supported by your license type. Please contact Semantria support (support@semantria.com) for details.' }"}]},"settings":"","auth":"required","params":[],"url":"/configurations.json"},"body":"","category":"577e4bf24159cd1900d5d2b3","createdAt":"2015-12-15T20:24:48.703Z","excerpt":"When creating a configuration, only a few fields are mandatory to set. These are:\n\n--name\n--is_primary\n--language\n\nThe complete list of settable values, their types and defaults, are listed here: [configuration values](doc:configuration-values).","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":49,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"creating-configurations","sync_unique":"","title":"Creating Configurations","type":"post","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

POST Creating Configurations

When creating a configuration, only a few fields are mandatory to set:

* name
* is_primary
* language

The complete list of settable values, their types, and defaults is listed here: [configuration values](doc:configuration-values).

Definition

https://api.semantria.com/configurations.json

Examples
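
A sketch adapted from the SDK example in the page source, with Python booleans in place of string flags and placeholder credentials:

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

# name, is_primary and language are the mandatory fields; anything
# omitted falls back to the documented defaults.
session.addConfigurations({
    "name": "myConfig",
    "is_primary": False,
    "language": "French",
    "document": {"detect_language": True},
    "collection": {"facets_limit": 5},
})
```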


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d302","api":{"examples":{"codes":[{"language":"json","code":"POST https://api.semantria.com/configurations.json\n[\n   {\n      \"config_id\" : \"ID\",\n      \"name\" : \"New test configuration\",\n      \"is_primary\" : true,\n      \"auto_response\" : false,\n      \"language\" : \"English\",\n      \"alphanumeric_threshold\" : 80,\n\t  \t\"entities_threshold\" : 0,\n      \"one_sentence_mode\" : false,\n      \"process_html\" : false,\n      \"callback\" : \"https://anyapi.anydomain.com/processed.json\",\n      \"document\" : {\n         \"pos_types\" : \"Noun,Verb,Adjective\",\n         \"sentiment_phrases\" : true,\n         \"auto_categories\" : true,\n         \"concept_topics\" : true,\n         \"query_topics\" : true,\n         \"named_entities\" : true,\n         \"user_entities\" : true,\n         \"mentions\" : false,\n         \"relations\" : false,\n         \"opinions\" : false,\n         \"themes\" : false,\n         \"summary_size\" : 0,\n         \"detect_language\" : true\n      },\n      \"collection\" : {\n         \"facets\" : true,\n         \"attributes\" : true,\n         \"mentions\" : false,\n         \"concept_topics\" : true,\n         \"query_topics\" : true,\n         \"named_entities\" : true,\n         \"themes_limit\" : false,\n\t\t     \"user_entitities\" : false,\n      }\n   }\n]","name":""},{"language":"xml","code":"POST https://api.semantria.com/configurations.xml\n<configurations>\n   <configuration>\n      <config_id> </config_id>\n      <name>New test configuration</name>\n      <is_primary>true</is_primary>\n      <auto_response>false</auto_response>\n      <language>English</language>\n      <chars_threshold>80</chars_threshold>\n\t  <entities_threshold>0</entities_threshold>\n      <one_sentence>false</one_sentence>\n      <process_html>false</process_html>\n      <callback>https://anyapi.anydomain.com/processed.json</callback>\n       <document>\n         <pos_types>Noun,Verb,Adjective</pos_types>\n         <phrases_limit>10</phrases_limit>\n         <possible_phrases_limit>10</possible_phrases_limit>\n         <auto_categories_limit>5</auto_categories_limit>\n         <concept_topics_limit>5</concept_topics_limit>\n         <query_topics_limit>5</query_topics_limit>\n         <named_entities_limit>5</named_entities_limit>\n         <user_entities_limit>5</user_entities_limit>\n         <entity_themes_limit>5</entity_themes_limit>\n         <named_mentions_limit>0</named_mentions_limit>\n         <user_mentions_limit>0</user_mentions_limit>\n         <named_relations_limit>0</named_relations_limit>\n         <user_relations_limit>0</user_relations_limit>\n         <named_opinions_limit>0</named_opinions_limit>\n         <user_opinions_limit>0</user_opinions_limit>\n         <themes_limit>0</themes_limit>\n         <theme_mentions_limit>0</theme_mentions_limit>\n         <summary_limit>0</summary_limit>\n         <detect_language>true</detect_language>\n      </document>\n      <collection>\n         <facets_limit>15</facets_limit>\n         <facet_atts_limit>5</facet_atts_limit>\n         <facet_mentions_limit>0</facet_mentions_limit>\n         <attribute_mentions_limit>0</attribute_mentions_limit>\n         <concept_topics_limit>5</concept_topics_limit>\n         <query_topics_limit>5</query_topics_limit>\n         <named_entities_limit>5</named_entities_limit>\n         <named_mentions_limit>0</named_mentions_limit>\n         <themes_limit>0</themes_limit>\n         
<theme_mentions_limit>0</theme_mentions_limit>\n\t\t <user_entities_limit>0</user_entities_limit>\n\t\t <user_mentions_limit>0</user_mentions_limit>\n      </collection>\n   </configuration>\n</configurations>"},{"language":"python","code":"import semantria\nsession = semantria.Session(key,secret)\nsession.updateConfigurations(\n  { \n    \"config_id\" : \"id\"\n    \"name\" : \"myConfig\", \n    \"auto_response\" : \"false\" , \n    \"language\" : \"French\" , \n    \"document\" : \n    { \n      \"detect_language\" : \"true\" \n    }, \n    \"collection\" : \n    { \n      \"facets_limit\" : 5 \n    } \n  }\n                         )"}]},"results":{"codes":[{"status":200,"language":"json","code":"HTTP/1.0 200 Request accepted and served.","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":"/configurations.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2b3","createdAt":"2015-09-04T18:17:42.714Z","excerpt":"Note a complete list of the settable values associated with configurations can be found here: [configuration values](doc:configuration-values). Note that when modifying a config you only need to send the values you wish to modify, you do not need to send all values.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":50,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"modifying-configurations","sync_unique":"","title":"Modifying Configurations","type":"put","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

PUT Modifying Configurations

A complete list of the settable values associated with configurations can be found here: [configuration values](doc:configuration-values). Note that when modifying a config you only need to send the values you wish to modify; you do not need to send all values.

Definition

https://api.semantria.com/configurations.[json | xml]

Examples
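
A minimal sketch of a partial update, per the SDK example in the page source (placeholder credentials and config_id):

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

# Send only the config_id plus the values you want to change.
session.updateConfigurations({
    "config_id": "id",
    "name": "myConfig",
    "document": {"detect_language": True},
})
```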


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d303","api":{"examples":{"codes":[{"name":"","code":"  { \"name\" = \"My new name\",\n    \"template\" = \"id\"\n  }","language":"http"},{"code":"import semantria\nsession = semantria.Session(key,secret)\nsession.cloneConfiguration( \n  { \"name\" = \"My new name\",\n    \"template\" = \"id\"\n  }\n)","language":"python"}]},"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55e9e1f00b48a121002ab3a6","ref":"","required":false,"desc":"enter the ID of the configuration you wish to clone","default":"","type":"string","name":"template","in":"body"},{"_id":"55e9e1f00b48a121002ab3a5","ref":"","required":false,"desc":"name of the new clone","default":"","type":"string","name":"name","in":"body"}],"url":"/configurations.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2b3","createdAt":"2015-09-04T18:23:35.225Z","editedParams":true,"editedParams2":true,"excerpt":"This makes an exact copy of an existing configuration","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":51,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"cloning-a-configuration","sync_unique":"","title":"Cloning a configuration","type":"post","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

POST Cloning a configuration

This makes an exact copy of an existing configuration.

Body JSON

template:
string
enter the ID of the configuration you wish to clone
name:
string
name of the new clone

Definition

https://api.semantria.com/configurations.[json | xml]

Examples
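
The SDK example from the page source, corrected to valid Python dict syntax (the original used `=` instead of `:`); credentials are placeholders:

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

# "template" is the ID of the configuration to copy; "name" names the clone.
session.cloneConfiguration({
    "name": "My new name",
    "template": "id",
})
```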


Result Format



{"category":"577e4bf24159cd1900d5d2b3","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d304","createdAt":"2015-09-04T18:25:59.820Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"examples":{"codes":[{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret, use_compression = True)\nsession.removeConfigurations(config_id)","name":""}]},"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{ 'status' : 400, 'message' : 'Configuration marked for delete does not exist.'","name":"Wrong ID to delete"}]},"settings":"","auth":"required","params":[],"url":"/configurations.[json | xml]"},"isReference":false,"order":52,"body":"","excerpt":"Send a list of config IDs to be deleted","slug":"delete","type":"delete","title":"Delete","__v":0,"childrenPages":[]}

DELETE Delete

Send a list of config IDs to be deleted.

Definition

https://api.semantria.com/configurations.[json | xml]

Examples
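
A sketch of the delete call; the SDK example in the page source passes a single config_id, while the list below mirrors the JSON body the endpoint accepts (placeholder credentials and IDs):

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

# Pass the IDs of the configurations to delete. The primary
# configuration cannot be deleted.
session.removeConfigurations(["cd2e7341-a3c2-4fb4-9d3a-779e8b0a5eff"])
```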


Result Format



{"category":"577e4bf24159cd1900d5d2b4","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d30d","createdAt":"2015-10-21T20:17:48.505Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":53,"body":"Semantria comes with pre-configured templates to aid in configuration creation. A template consists of a language, API settings (such as one_sentence mode) and NLP tuning (such as pre-built query topics or sentiment phrases). Some templates are free and include basic samples of NLP tuning. Some templates are for-purchase and contain industry-specific taxonomies and sentiment dictionaries.\n\nWhen creating a configuration, you can specify an ID of a template and the elements of the template will show up in the created configuration.\n\nTemplates are read-only and new templates can only be created by Semantria.","excerpt":"","slug":"template-basics","type":"basic","title":"Template basics","__v":0,"childrenPages":[]}

Template basics


Semantria comes with pre-configured templates to aid in configuration creation. A template consists of a language, API settings (such as one_sentence mode) and NLP tuning (such as pre-built query topics or sentiment phrases). Some templates are free and include basic samples of NLP tuning. Some templates are for-purchase and contain industry-specific taxonomies and sentiment dictionaries.

When creating a configuration, you can specify an ID of a template and the elements of the template will show up in the created configuration.

Templates are read-only and new templates can only be created by Semantria.
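
A hedged sketch of applying a template by cloning from its ID; treating a template's config_id as the "template" field of the clone call is an assumption based on that endpoint's parameters, and the ID below is just the sample value from the templates listing:

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

# Assumption: a template's config_id (from GET /templates.json) can be
# passed as the "template" field of the clone call, yielding a new
# configuration pre-loaded with the template's settings and NLP tuning.
session.cloneConfiguration({
    "name": "hotel reviews config",
    "template": "cba6daae76d64cc1658593f22fa0555b",  # a template ID
})
```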
{"category":"577e4bf24159cd1900d5d2b4","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d30e","createdAt":"2015-10-21T20:22:35.860Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"examples":{"codes":[{"language":"http","code":"https://api.semantria.com/templates.json","name":""},{"language":"python","code":""}]},"results":{"codes":[{"status":200,"language":"json","code":"[\n    {\n        \"auto_responding\": false,\n        \"categories_threshold\": 0.45,\n        \"chars_threshold\": 80,\n        \"collection\": {\n            \"attribute_mentions_limit\": 0,\n            \"concept_topics_limit\": 5,\n            \"facet_atts_limit\": 5,\n            \"facet_mentions_limit\": 0,\n            \"facets_limit\": 15,\n            \"named_entities_limit\": 5,\n            \"named_mentions_limit\": 0,\n            \"query_topics_limit\": 5,\n            \"theme_mentions_limit\": 0,\n            \"themes_limit\": 5,\n            \"user_entities_limit\": 0,\n            \"user_mentions_limit\": 0\n        },\n        \"config_id\": \"0581e02fb27066ac973c182545693f3e\",\n        \"document\": {\n            \"auto_categories_limit\": 5,\n            \"concept_topics_limit\": 5,\n            \"detect_language\": true,\n            \"entity_themes_limit\": 0,\n            \"intentions\": false,\n            \"model_sentiment\": false,\n            \"named_entities_limit\": 5,\n            \"named_mentions_limit\": 0,\n            \"named_opinions_limit\": 0,\n            \"named_relations_limit\": 0,\n            \"phrases_limit\": 0,\n            \"possible_phrases_limit\": 0,\n            \"query_topics_limit\": 5,\n            \"summary_limit\": 3,\n            \"theme_mentions_limit\": 0,\n            \"themes_limit\": 5,\n            \"user_entities_limit\": 5,\n            \"user_mentions_limit\": 0,\n            \"user_opinions_limit\": 0,\n            \"user_relations_limit\": 0\n        },\n        \"entities_threshold\": 55,\n        \"is_primary\": false,\n        \"language\": \"English\",\n        \"modified\": 0,\n        \"name\": \"default_template_config_9438637610992\",\n        \"one_sentence\": false,\n        \"process_html\": false,\n        \"rights\": []\n    },\n  {\n        \"auto_responding\": false,\n        \"categories_threshold\": 0.45,\n        \"chars_threshold\": 80,\n        \"collection\": {\n            \"attribute_mentions_limit\": 0,\n            \"concept_topics_limit\": 5,\n            \"facet_atts_limit\": 5,\n            \"facet_mentions_limit\": 0,\n            \"facets_limit\": 15,\n            \"named_entities_limit\": 5,\n            \"named_mentions_limit\": 0,\n            \"query_topics_limit\": 5,\n            \"theme_mentions_limit\": 0,\n            \"themes_limit\": 5,\n            \"user_entities_limit\": 0,\n            \"user_mentions_limit\": 0\n        },\n        \"config_id\": \"cba6daae76d64cc1658593f22fa0555b\",\n        \"document\": {\n            \"auto_categories_limit\": 5,\n            \"concept_topics_limit\": 5,\n            \"detect_language\": true,\n            \"entity_themes_limit\": 0,\n            \"intentions\": false,\n            \"model_sentiment\": false,\n            \"named_entities_limit\": 5,\n            \"named_mentions_limit\": 0,\n            \"named_opinions_limit\": 0,\n            \"named_relations_limit\": 0,\n            \"phrases_limit\": 0,\n   
         \"possible_phrases_limit\": 0,\n            \"query_topics_limit\": 5,\n            \"summary_limit\": 3,\n            \"theme_mentions_limit\": 0,\n            \"themes_limit\": 5,\n            \"user_entities_limit\": 5,\n            \"user_mentions_limit\": 0,\n            \"user_opinions_limit\": 0,\n            \"user_relations_limit\": 0\n        },\n        \"entities_threshold\": 55,\n        \"is_primary\": false,\n        \"language\": \"English\",\n        \"modified\": 0,\n        \"name\": \"vp: hotel, v1\",\n        \"one_sentence\": false,\n        \"process_html\": false,\n        \"rights\": [],\n        \"version\": \"1.3.2\"\n    }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":"/templates.json"},"isReference":false,"order":54,"body":"","excerpt":"","slug":"list-templates","type":"get","title":"List Templates","__v":0,"childrenPages":[]}

GET List Templates

Definition

https://api.semantria.com/templates.json

Examples


Result Format



{"category":"577e4bf24159cd1900d5d2b5","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2dc","createdAt":"2015-09-04T18:55:54.717Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":55,"body":"You can create queries per configuration up to the limit specified in your subscription. Queries are referred to by name. Queries can be a certain number of characters long, up to the value specified in your subscription.","excerpt":"","slug":"query-basics","type":"basic","title":"Query Basics","__v":0,"childrenPages":[]}

Query Basics


You can create queries per configuration up to the limit specified in your subscription. Queries are referred to by name. Each query can be up to the maximum number of characters specified in your subscription.
{"__v":0,"_id":"577e4bf24159cd1900d5d2dd","api":{"examples":{"codes":[{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret, use_compression = True)\nsession.getQueries(config_id = \"id\")","name":""}]},"results":{"codes":[{"status":200,"language":"json","code":"[\n   {\n      \"id\": \"ec889834-2498-4909-b90d-db93f9a06a6a\", \n      \"modified\": 1450195636, \n      \"name\" : \"Feature: Cloud service\",\n      \"query\" : \"Amazon AND EC2 AND Cloud\"\n   }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eed513d6e2c62b001ac0e1","ref":"","required":false,"desc":"ID of configuration you wish to create queries in","default":"","type":"string","name":"config_id","in":"query"}],"url":"/queries.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2b5","createdAt":"2015-09-08T12:31:15.848Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":56,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"list-queries-1","sync_unique":"","title":"List queries","type":"get","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

GET List queries


Query Params

config_id:
string
ID of the configuration whose queries you wish to list

Definition

https://api.semantria.com/queries.[json | xml]

Examples
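
The SDK call from the page source, as a minimal sketch (placeholder credentials and config_id):

```python
import semantria

key, secret = "your-key", "your-secret"  # placeholder credentials
session = semantria.Session(key, secret)

# Returns the queries defined in the given configuration.
queries = session.getQueries(config_id="id")
```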


Result Format



{"__v":0,"_id":"577e4bf24159cd1900d5d2de","api":{"examples":{"codes":[{"name":"","code":"[\n   {\n      \"name\" : \"Cloud Services\",\n      \"query\" : \"Amazon OR EC2 OR Cloud\"\n   }\n]","language":"json"},{"code":"import semantria\nsession = semantria.Session(key, secret, use_compression = True)\nsession.addQueries( \n  { \n    \"name\" : \"Cloud Services\", \n    \"query\" : \"Amazon OR EC2 OR Cloud\" \n  }, \n  config_id = \"id\" \n)","language":"python"}]},"results":{"codes":[{"status":200,"language":"json","code":"[\n    {\n        \"id\": \"3463f27c-becd-4de2-a11b-05061e040e41\", \n        \"modified\": 1450195636, \n        \"name\": \"Cloud Services\", \n        \"query\": \"Amazon OR EC2 OR Cloud\"\n    }\n]","name":""},{"status":400,"language":"json","code":"{ 'status' : 400, 'message' : 'Query \"Feature: Cloud service\" has an error. Error in line 1, column 4 : Syntax error Illegal character \\'\"<\"\\' found in line: line: 1, col: col: 2'","name":"Query syntax error"},{"name":"Too many queries in config","status":406,"language":"json","code":"{ 'status' : 406, 'message' : 'The number of permitted queries per configuration has been reached.'"},{"code":"Length of query [Amazon AND (cloud OR service)] is exceeding the limit of 54 characters.","language":"text","status":400,"name":"Query too long"}]},"settings":"","auth":"required","params":[{"_id":"55e9a5d57f564237001d5bce","ref":"","required":false,"desc":"ID of configuration containing queries you wish to list.","default":"","type":"string","name":"config_id","in":"body"}],"url":"/queries.[ json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2b5","createdAt":"2015-09-04T18:56:30.301Z","editedParams":true,"editedParams2":true,"excerpt":"Pass an object within the body of the request with a param","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":57,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"list-queries","sync_unique":"","title":"Create Queries","type":"post","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

postCreate Queries

Pass a JSON-encoded list of query objects (name and query pairs) within the body of the request.

Body JSON

config_id:
string
ID of configuration you wish to create queries in.

Definition

{{ api_url }}{{ page_api_url }}

Examples
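A runnable version of the stored Python example (key, secret, and config_id are placeholders; the queries are passed as a list of name/query objects, matching the JSON example):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
session.addQueries(
    [{"name": "Cloud Services", "query": "Amazon OR EC2 OR Cloud"}],
    config_id="id"
)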


Result Format



{"__v":0,"_id":"577e4bf24159cd1900d5d2df","api":{"examples":{"codes":[{"language":"json","code":"[\n   {\n      \"id\": \"3463f27c-becd-4de2-a11b-05061e040e41\",\n      \"name\" : \"Cloud Services\",\n      \"query\" : \"EC2 OR \\(Amazon AND Cloud\\)\"\n   }\n]","name":""},{"code":"import semantria\nsession = semantria.Session(key, secret, use_compression = True)\nsession.updateQueries( \n  { \n    \"id\": \"3463f27c-becd-4de2-a11b-05061e040e41\",\n    \"name\" : \"Cloud Services\",\n    \"query\" : \"EC2 OR \\)Amazon AND cloud\\)\"\n  }\n  config_id = \"id\"\n)","language":"python"}]},"results":{"codes":[{"name":"","code":"[\n    {\n        \"id\": \"3463f27c-becd-4de2-a11b-05061e040e41\", \n        \"modified\": 1450195965, \n        \"name\": \"Cloud Services\", \n        \"query\": \"EC2 OR \\(Amazon AND Cloud\\)\"\n    }\n]","language":"python","status":200},{"name":"","code":"{}","language":"json","status":400}]},"settings":"","auth":"required","params":[{"_id":"55e9a5d57f564237001d5bce","ref":"","required":false,"desc":"ID of configuration containing queries you wish to update","default":"","type":"string","name":"config_id","in":"body"}],"url":"/queries.json"},"body":"","category":"577e4bf24159cd1900d5d2b5","createdAt":"2015-09-08T12:21:28.564Z","editedParams":true,"editedParams2":true,"excerpt":"Pass a JSON-encoded object within the body of the request. A list of query, text key and value pairs. The text of the query submitted will replace an existing query.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":58,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"update-queries","sync_unique":"","title":"Update Queries","type":"put","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

putUpdate Queries

Pass a JSON-encoded list of objects within the body of the request, each giving a query id, name, and query text. The text of the query submitted will replace the existing query.

Body JSON

config_id:
string
ID of configuration containing queries you wish to update

Definition

{{ api_url }}{{ page_api_url }}

Examples
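A runnable version of the stored Python example (key, secret, and config_id are placeholders; a stray parenthesis and missing comma in the stored snippet are corrected here):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
session.updateQueries(
    [{
        "id": "3463f27c-becd-4de2-a11b-05061e040e41",
        "name": "Cloud Services",
        "query": "EC2 OR \\(Amazon AND Cloud\\)"  # parentheses escaped per query syntax
    }],
    config_id="id"
)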


Result Format



{"__v":0,"_id":"577e4bf24159cd1900d5d2e0","api":{"examples":{"codes":[{"name":"","code":"[\n   \"85d14f37-cee6-4e28-9a1d-5bb698adcfd6\"\n]","language":"json"},{"code":"import semantria\nsession = semantria.Session(key, secret, use_compression = True)\nsession.removeQueries([\"85d14f37-cee6-4e28-9a1d-5bb698adcfd6\"], config_id = \"id\")","language":"python"}]},"results":{"codes":[{"status":200,"language":"json","code":"","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eed3966ec7282b00e30257","ref":"","required":false,"desc":"ID of the configuration holding the queries","default":"","type":"string","name":"config_id","in":"body"}],"url":"/queries.json"},"body":"","category":"577e4bf24159cd1900d5d2b5","createdAt":"2015-09-08T12:24:54.377Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":59,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"delete-queries","sync_unique":"","title":"Delete Queries","type":"delete","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

deleteDelete Queries


Body JSON

config_id:
string
ID of the configuration holding the queries

Definition

{{ api_url }}{{ page_api_url }}

Examples
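A runnable version of the stored Python example, deleting queries by ID (key, secret, and config_id are placeholders):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
session.removeQueries(["85d14f37-cee6-4e28-9a1d-5bb698adcfd6"], config_id="id")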



{"category":"577e4bf24159cd1900d5d2b6","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2c4","createdAt":"2015-09-08T12:28:24.436Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":60,"body":"You can create categories per configuration up to the limit of categories specified in your subscription. Categories are referred to by name. Categories have three values - name, weight and samples.","excerpt":"","slug":"category-basics","type":"basic","title":"Category Basics","__v":0,"childrenPages":[]}

Category Basics


You can create categories per configuration up to the limit of categories specified in your subscription. Categories are referred to by name. Categories have three values - name, weight and samples.
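For illustration, here is a single category with all three values written as a Python dict, mirroring the List Categories result shown later in this reference (the name, weight, and samples are examples only):

category = {
    "name": "Feature: Cloud service",  # categories are referred to by name
    "weight": 0.75,                    # relevancy weight of the category
    "samples": ["Amazon", "EC2"]       # sample terms that define the category
}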
{"__v":0,"_id":"577e4bf24159cd1900d5d2c5","api":{"examples":{"codes":[{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret, use_compression=True )\nsession.getCategories( config_id = \"id\" )","name":""}]},"results":{"codes":[{"status":200,"language":"json","code":"[\n   {\n      \"name\" : \"Feature: Cloud service\",\n      \"weight\" : 0.75,\n      \"samples\" : [ \"Amazon\" , \"EC2\" ]\n   }\n ]","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eed4f1d6e2c62b001ac0df","ref":"","required":false,"desc":"ID of configuration you want to list categories for","default":"","type":"string","name":"config_id","in":"query"}],"url":"/categories.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2b6","createdAt":"2015-09-08T12:30:41.872Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":61,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"list-categories","sync_unique":"","title":"List Categories","type":"get","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

getList Categories


Query Params

config_id:
string
ID of configuration you want to list categories for

Definition

{{ api_url }}{{ page_api_url }}

Examples
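A runnable version of the stored Python example for this endpoint (key, secret, and config_id are placeholders):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
categories = session.getCategories(config_id="id")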


Result Format



{"__v":0,"_id":"577e4bf24159cd1900d5d2c6","api":{"examples":{"codes":[{"name":"","code":"import semantria\nsession = semantria.Session(key, secret, use_compression=True )\nsession.addCategories( \n  {\n    \"name\" : \"Food\",\n    \"weight\" : 1,\n    \"samples\" : \"food, restaurant\"\n  }\n  config_id = \"id\"\n)","language":"python"},{"language":"json","code":"[  \n{\n    \"name\" : \"Food\",\n    \"weight\" : 1,\n    \"samples\" : [\"food\", \"restaurant\"]\n}\n]"}]},"results":{"codes":[{"status":200,"language":"json","code":"[\n    {\n        \"id\": \"b09933f2-d274-4711-8ff5-2d994ce41c5a\", \n        \"modified\": 1450197417, \n        \"name\": \"Food\", \n        \"samples\": [\n            \"food\", \n            \"restaurant\"\n        ], \n        \"weight\": 1.0\n    }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eed4c4c93e8c17008a10ed","ref":"","required":false,"desc":"ID of the configuration you wish to create categories in","default":"","type":"string","name":"config_id","in":"body"}],"url":"/categories.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2b6","createdAt":"2015-09-08T12:29:56.247Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":62,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"create-categories","sync_unique":"","title":"Create Categories","type":"post","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

postCreate Categories


Body JSON

config_id:
string
ID of the configuration you wish to create categories in

Definition

{{ api_url }}{{ page_api_url }}

Examples
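A runnable version of the stored Python example (key, secret, and config_id are placeholders; samples is given as a list per the JSON example, where the stored Python snippet passed a comma-separated string):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
session.addCategories(
    [{"name": "Food", "weight": 1, "samples": ["food", "restaurant"]}],
    config_id="id"
)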


Result Format



{"__v":0,"_id":"577e4bf24159cd1900d5d2c7","api":{"examples":{"codes":[{"name":"","code":"import semantria\nsession = semantria.Session(key, secret, use_compression=True )\nsession.updateCategories( \n  {\n    \"id\":\"b09933f2-d274-4711-8ff5-2d994ce41c5a\",\n    \"name\" : \"Food\",\n    \"weight\" : 1,\n    \"samples\" : \"food, restaurant, wine\"\n  },\n  config_id = \"id\"\n)","language":"python"},{"language":"json","code":"[\n  {\n    \"id\":\"b09933f2-d274-4711-8ff5-2d994ce41c5a\",\n    \"name\" : \"Food\",\n    \"weight\" : 1,\n    \"samples\" : [\"food\", \"restaurant\", \"wine\"]\n  }\n]"}]},"results":{"codes":[{"name":"","code":"[\n    {\n        \"id\": \"b09933f2-d274-4711-8ff5-2d994ce41c5a\", \n        \"modified\": 1450197726, \n        \"name\": \"Food\", \n        \"samples\": [\n            \"food\", \n            \"restaurant\", \n            \"wine\"\n        ], \n        \"weight\": 1.0\n    }\n]","language":"json","status":200},{"name":"","code":"{}","language":"json","status":400}]},"settings":"","auth":"required","params":[{"_id":"55eed4c4c93e8c17008a10ed","ref":"","required":false,"desc":"ID of the configuration you wish to update categories in","default":"","type":"string","name":"config_id","in":"body"}],"url":"/categories.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2b6","createdAt":"2015-09-08T12:31:43.509Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":63,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"update-categories","sync_unique":"","title":"Update Categories","type":"put","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

putUpdate Categories


Body JSON

config_id:
string
ID of the configuration you wish to update categories in

Definition

{{ api_url }}{{ page_api_url }}

Examples
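A runnable version of the stored Python example (key, secret, and config_id are placeholders; samples is given as a list per the JSON example):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
session.updateCategories(
    [{
        "id": "b09933f2-d274-4711-8ff5-2d994ce41c5a",
        "name": "Food",
        "weight": 1,
        "samples": ["food", "restaurant", "wine"]
    }],
    config_id="id"
)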


Result Format



{"__v":0,"_id":"577e4bf24159cd1900d5d2c8","api":{"examples":{"codes":[{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret, use_compression=True )\nsession.removeCategories( \n[ \"85d14f37-cee6-4e28-9a1d-5bb698adcfd6\" ],\n  config_id = \"id\"\n  )","name":""},{"language":"json","code":"[\n\"85d14f37-cee6-4e28-9a1d-5bb698adcfd6\"\n]"}]},"results":{"codes":[{"status":200,"language":"json","code":"","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eed4c4c93e8c17008a10ed","ref":"","required":false,"desc":"ID of the configuration you wish to update categories in","default":"","type":"string","name":"config_id","in":"body"}],"url":"/categories.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2b6","createdAt":"2015-09-08T12:32:18.472Z","editedParams":true,"editedParams2":true,"excerpt":"Send a list of categories names to be deleted.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":64,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"delete-categories","sync_unique":"","title":"Delete Categories","type":"delete","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

deleteDelete Categories

Send a list of category names to be deleted.

Body JSON

config_id:
string
ID of the configuration you wish to delete categories from

Definition

{{ api_url }}{{ page_api_url }}

Examples
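A runnable version of the stored Python example (key, secret, and config_id are placeholders):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
session.removeCategories(["85d14f37-cee6-4e28-9a1d-5bb698adcfd6"], config_id="id")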



{"category":"577e4bf24159cd1900d5d2b7","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d2fa","createdAt":"2015-10-21T16:40:54.971Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":65,"body":"Taxonomies are supported in version 4.0 and later of Semantria.\n\nYou can create one taxonomy per configuration. By default, the taxonomy can be five levels deep and contain up to the number of queries and categories allowed per config in your license.\n\nA taxonomy node has three values - a name, an id, and whether parent matching should be enforced. The ID is set by the API when the node is created.\n\nIf enforce_parent_matching is set to true for a node, then the node will only be returned if the parent node also matches. For example, if you have:\n\n>Pets (node)\n>>Q Pets:  pet OR domestic\n>>Dogs (node)\n>>>Q Dog: dog\n \nand enforce_parent_matching is set to True for the node Dogs, then this document will match:\n\n*I have a pet dog*\n\nWhile this document:\n\n*I want a hot dog*\n\nwill NOT match.\n\nQueries and categories associated with a node must exist already in your configuration. A taxonomy is just a way of arranging these resources.","excerpt":"Taxonomies provide a way to hierarchically arrange queries and categories in a single structure","slug":"taxonomy-basics","type":"basic","title":"Taxonomy basics","__v":0,"childrenPages":[]}

Taxonomy basics

Taxonomies provide a way to hierarchically arrange queries and categories in a single structure.

Taxonomies are supported in version 4.0 and later of Semantria.

You can create one taxonomy per configuration. By default, the taxonomy can be five levels deep and contain up to the number of queries and categories allowed per config in your license.

A taxonomy node has three values - a name, an id, and whether parent matching should be enforced. The ID is set by the API when the node is created.

If enforce_parent_matching is set to true for a node, then the node will only be returned if the parent node also matches. For example, if you have:

>Pets (node)
>>Q Pets: pet OR domestic
>>Dogs (node)
>>>Q Dog: dog

and enforce_parent_matching is set to true for the node Dogs, then this document will match:

*I have a pet dog*

while this document:

*I want a hot dog*

will NOT match.

Queries and categories associated with a node must exist already in your configuration. A taxonomy is just a way of arranging these resources.
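As a sketch only, the Pets example above might be submitted in a shape like the following. The general field names follow the Create Taxonomy example later in this reference, but the nested nodes key and the enforce_parent_matching spelling are assumptions based on the description above, and the topic IDs are placeholders for IDs of existing queries:

taxonomy = [{
    "name": "Pets Taxonomy",
    "nodes": [{
        "name": "Pets",
        "topics": [{"id": "<id of the Pets query>", "type": "query"}],
        "nodes": [{
            "name": "Dogs",
            "enforce_parent_matching": True,  # Dogs only returned if Pets also matches
            "topics": [{"id": "<id of the Dog query>", "type": "query"}]
        }]
    }]
}]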
{"__v":0,"_id":"577e4bf34159cd1900d5d2fb","api":{"examples":{"codes":[{"name":"","code":"http://api.semantria.com/taxonomy.json&config_id=123","language":"http"},{"code":"import semantria\nsession = semantria.Session(key, secret, use_compression = True)\nsession.getTaxonomy(config_id = \"id\")","language":"python"}]},"results":{"codes":[{"status":200,"language":"json","code":"[\n    {\n        \"id\": \"a33ba9e0-f720-436d-833f-f8f4babe5600\",\n        \"name\": \"Taste\",\n        \"timestamp\": 1441745387,\n        \"topics\": [\n            {\n                \"id\": \"a2661d73-f5f8-4c1e-8a11-a154507ac494\",\n                \"type\": \"query\"\n            },\n            {\n                \"id\": \"a8652ec0-c32b-4f09-aec5-1d652813317a\",\n                \"type\": \"query\"\n            }\n        ]\n    }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"5627d6e0e2ce610d004e3f63","ref":"","required":false,"desc":"Id of config to list taxonomy for","default":"","type":"string","name":"config_id","in":"query"}],"url":"/taxonomy.json"},"body":"","category":"577e4bf24159cd1900d5d2b7","createdAt":"2015-10-21T18:18:08.579Z","editedParams":true,"editedParams2":true,"excerpt":"Lists the existing taxonomy structure for a given config.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":66,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"list-taxonomy","sync_unique":"","title":"List Taxonomy","type":"get","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

getList Taxonomy

Lists the existing taxonomy structure for a given config.

Query Params

config_id:
string
Id of config to list taxonomy for

Definition

{{ api_url }}{{ page_api_url }}

Examples
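A runnable version of the stored Python example for this endpoint (key, secret, and config_id are placeholders):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
taxonomy = session.getTaxonomy(config_id="id")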


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d2fc","api":{"examples":{"codes":[{"language":"http","code":"[\n\t{\n\t\t\"name\":\"My Taxonomy\",\n\t\t\"nodes\": [\n\t\t\t{\n\t\t\t\t\"name\":\"Sample Node\",\n\t\t\t\t\"topics\": [\n\t\t\t\t\t{\n\t\t\t\t\t\"id\": \"607ce795-291f-4dd4-8745-8039f5c40b72\",\n\t\t\t\t\t\"type\": \"query\"\n\t\t\t\t\t}\n         ]\n       }\n     ]\n   }\n]","name":""},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret, use_compression = True)\nsession.addTaxonomy( config_id = \"id\",\n[\n\t{\n\t\t\"name\":\"My Taxonomy\",\n\t\t\"nodes\": [\n\t\t\t{\n\t\t\t\t\"name\":\"Sample Node\",\n\t\t\t\t\"topics\": [\n\t\t\t\t\t{\n\t\t\t\t\t\"id\": \"607ce795-291f-4dd4-8745-8039f5c40b72\",\n\t\t\t\t\t\"type\": \"query\"\n\t\t\t\t\t}\n         ]\n       }\n     ]\n   }\n]\n                    )"}]},"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"5627d8e03e0add0d00c9edb3","ref":"","required":false,"desc":"ID of config to create the taxonomy in","default":"","type":"string","name":"config_id","in":"body"}],"url":"/taxonomy.json"},"body":"","category":"577e4bf24159cd1900d5d2b7","createdAt":"2015-10-21T18:20:02.571Z","editedParams":true,"editedParams2":true,"excerpt":"Crete a new taxonomy in a given config. Each config can have only one taxonomy defined.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":67,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"create-taxonomy","sync_unique":"","title":"Create Taxonomy","type":"post","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

postCreate Taxonomy

Create a new taxonomy in a given config. Each config can have only one taxonomy defined.

Body JSON

config_id:
string
ID of config to create the taxonomy in

Definition

{{ api_url }}{{ page_api_url }}

Examples
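A runnable version of the stored Python example (key, secret, and config_id are placeholders; the argument order of the stored snippet is corrected so the payload list comes before the keyword argument):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
session.addTaxonomy(
    [{
        "name": "My Taxonomy",
        "nodes": [{
            "name": "Sample Node",
            "topics": [{"id": "607ce795-291f-4dd4-8745-8039f5c40b72", "type": "query"}]
        }]
    }],
    config_id="id"
)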


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d2fd","api":{"examples":{"codes":[{"language":"http","code":"[{\"id\":\"5a543ba8-cd7d-4af6-b69f-c99e0836de77\", \n\"topics\":[\n\t\t{\n\t\t\t\"id\":\"3fc7692e-996c-421a-a046-c8e2ff314e26\", \n\t\t\t\"type\":\"QUERY\"\n\t\t}]\n}]","name":""},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret, use_compression = True)\nsession.updateTaxonomy( config_id = \"id\",\n[{\"id\":\"5a543ba8-cd7d-4af6-b69f-c99e0836de77\", \n\"topics\":[\n\t\t{\n\t\t\t\"id\":\"3fc7692e-996c-421a-a046-c8e2ff314e26\", \n\t\t\t\"type\":\"QUERY\"\n\t\t}]\n}]\n                       )"}]},"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"5627d95bfcbbc621004ec09d","ref":"","required":false,"desc":"ID of config with taxonomy to update","default":"","type":"string","name":"config_id","in":"body"}],"url":"/taxonomy.json"},"body":"","category":"577e4bf24159cd1900d5d2b7","createdAt":"2015-10-21T18:28:43.431Z","editedParams":true,"editedParams2":true,"excerpt":"To add nodes, or add topics to nodes, specify the relationship in the JSON you submit. You must refer to existing topics and nodes by their ID.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":68,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"update-taxonomy","sync_unique":"","title":"Update Taxonomy","type":"put","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

putUpdate Taxonomy

To add nodes, or add topics to nodes, specify the relationship in the JSON you submit. You must refer to existing topics and nodes by their ID.

Body JSON

config_id:
string
ID of config with taxonomy to update

Definition

{{ api_url }}{{ page_api_url }}

Examples
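A runnable version of the stored Python example (key, secret, and config_id are placeholders; existing nodes and topics are referenced by their IDs, as the excerpt above describes):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
session.updateTaxonomy(
    [{
        "id": "5a543ba8-cd7d-4af6-b69f-c99e0836de77",
        "topics": [{"id": "3fc7692e-996c-421a-a046-c8e2ff314e26", "type": "QUERY"}]
    }],
    config_id="id"
)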


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d2fe","api":{"examples":{"codes":[{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret, use_compression = True)\nsession.deleteTaxonomy( config_id = \"id\")"}]},"results":{"codes":[{"status":202,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"56d9ba776fcdd00b0002cbe6","ref":"","required":false,"desc":"ID of config to delete from","default":"","type":"string","name":"config_id","in":"body"}],"url":"/taxonomy.json"},"body":"[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"\",\n      \"language\": \"text\"\n    }\n  ]\n}\n[/block]","category":"577e4bf24159cd1900d5d2b7","createdAt":"2016-03-04T16:39:20.974Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":69,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"delete-taxonomy","sync_unique":"","title":"Delete Taxonomy","type":"delete","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

deleteDelete Taxonomy


Body JSON

config_id:
string
ID of config to delete from

Definition

{{ api_url }}{{ page_api_url }}

Examples
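A runnable version of the stored Python example for this endpoint (key, secret, and config_id are placeholders):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
session.deleteTaxonomy(config_id="id")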


Result Format



{"category":"577e4bf24159cd1900d5d2b8","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d30f","createdAt":"2015-09-08T12:35:47.075Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":70,"body":"Sentiment phrases can be up to three words in length, and have a weight of -2 to 2. Sentiment phrases can contain query operators and use the same syntax as regular Semantria query categories. \n\nThe sentiment of the phrase can be negated (ex: not good) or intensified (ex: very good). Semantria output will list which phrases were negated and intensified along with the negator or intensifier.\n\nIn addition, if you have mentions enabled in your configuration, you will receive the offset and byte length of each phrase.","excerpt":"","slug":"sentiment-basics","type":"basic","title":"Sentiment Basics","__v":0,"childrenPages":[]}

Sentiment Basics


Sentiment phrases can be up to three words in length, and have a weight of -2 to 2. Sentiment phrases can contain query operators and use the same syntax as regular Semantria query categories.

The sentiment of the phrase can be negated (ex: not good) or intensified (ex: very good). Semantria output will list which phrases were negated and intensified along with the negator or intensifier.

In addition, if you have mentions enabled in your configuration, you will receive the offset and byte length of each phrase.
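For illustration, a single sentiment phrase as a Python dict, mirroring the phrase endpoints later in this reference (the name and weight are examples only):

phrase = {
    "name": "excellent service",  # up to three words; may contain query operators
    "weight": 0.8                 # sentiment weight in the range -2 to 2
}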
{"__v":0,"_id":"577e4bf34159cd1900d5d310","api":{"examples":{"codes":[{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret, use_compression=True )\nsession.getPhrases( config_id=\"id\" )","name":""}]},"results":{"codes":[{"name":"","code":"[\n   {\n      \"id\": \"bd91d796-4ff2-41cf-9954-08c23a8ab239\", \n      \"modified\": 1450197726,\n      \"name\" : \"excellent service\",\n      \"weight\" : 0.8\n   }\n]","language":"json","status":200},{"name":"","code":"{}","language":"json","status":400}]},"settings":"","auth":"required","params":[{"_id":"55eed67e6ec7282b00e3025c","ref":"","required":false,"desc":"ID of configuration you wish to list sentiment phrases for","default":"","type":"string","name":"config_id","in":"query"}],"url":"/phrases.json"},"body":"","category":"577e4bf24159cd1900d5d2b8","createdAt":"2015-09-08T12:37:18.886Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":71,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"list-sentiment-phrases","sync_unique":"","title":"List sentiment phrases","type":"get","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

getList sentiment phrases


Query Params

config_id:
string
ID of configuration you wish to list sentiment phrases for

Definition

{{ api_url }}{{ page_api_url }}

Examples
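A runnable version of the stored Python example for this endpoint (key, secret, and config_id are placeholders):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
phrases = session.getPhrases(config_id="id")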


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d311","api":{"examples":{"codes":[{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret, use_compression=True )\nsession.addPhrases(\n   {\n      \"name\" : \"excellent service\",\n      \"weight\" : 0.8\n   },\n  config_id = \"id\"\n)","name":""},{"language":"json","code":"[\n  {\n\t\t\t  \"name\": \"excellent service\", \n        \"weight\": 0.8\n    }\n]"}]},"results":{"codes":[{"status":202,"language":"json","code":"[\n    {\n        \"id\": \"b7150c19-ed58-44f7-9801-9f70d15e375a\", \n        \"modified\": 1450197978, \n        \"name\": \"excellent service\", \n        \"weight\": 0.8\n    }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eed67e6ec7282b00e3025c","ref":"","required":false,"desc":"ID of configuration you wish to list sentiment phrases for","default":"","type":"string","name":"config_id","in":"body"}],"url":"/phrases.json"},"body":"","category":"577e4bf24159cd1900d5d2b8","createdAt":"2015-09-08T12:37:55.675Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":72,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"create-sentiment-phrases","sync_unique":"","title":"Create sentiment phrases","type":"post","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

postCreate sentiment phrases


Body JSON

config_id:
string
ID of configuration you wish to create sentiment phrases in

Definition

{{ api_url }}{{ page_api_url }}

Examples
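A runnable version of the stored Python example (key, secret, and config_id are placeholders; the phrase is passed as a list, matching the JSON example):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
session.addPhrases(
    [{"name": "excellent service", "weight": 0.8}],
    config_id="id"
)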


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d312","api":{"examples":{"codes":[{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret, use_compression=True )\nsession.updatePhrases(\n   {\n      \"id\": \"b7150c19-ed58-44f7-9801-9f70d15e375a\",\n      \"name\" : \"excellent service\",\n      \"weight\" : 0.8\n   },\n  config_id = \"id\"\n)","name":""},{"language":"json","code":"[\n    {\n        \"id\": \"b7150c19-ed58-44f7-9801-9f70d15e375a\", \n        \"name\": \"excellent service\", \n        \"weight\": 0.8\n    }\n]"}]},"results":{"codes":[{"status":200,"language":"json","code":"[\n    {\n        \"id\": \"b7150c19-ed58-44f7-9801-9f70d15e375a\", \n        \"modified\": 1450197978, \n        \"name\": \"bad dates\", \n        \"weight\": -0.6\n    }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eed67e6ec7282b00e3025c","ref":"","required":false,"desc":"ID of configuration you wish to update sentiment phrases for","default":"","type":"string","name":"config_id","in":"body"}],"url":"/phrases.json"},"body":"","category":"577e4bf24159cd1900d5d2b8","createdAt":"2015-09-08T12:38:28.993Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":73,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"update-sentiment-phrases","sync_unique":"","title":"Update sentiment phrases","type":"put","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

putUpdate sentiment phrases


Body JSON

config_id:
string
ID of configuration you wish to update sentiment phrases for

Definition

{{ api_url }}{{ page_api_url }}

Examples
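A runnable version of the stored Python example, referencing the phrase to update by ID (key, secret, and config_id are placeholders):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
session.updatePhrases(
    [{"id": "b7150c19-ed58-44f7-9801-9f70d15e375a",
      "name": "excellent service", "weight": 0.8}],
    config_id="id"
)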


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d313","api":{"examples":{"codes":[{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret, use_compression=True )\nsession.removePhrases(\n\t[\n  \t\"85d14f37-cee6-4e28-9a1d-5bb698adcfd6\"\n\t],\n  config_id = \"id\"\n)","name":""},{"language":"json","code":"[\n  \"85d14f37-cee6-4e28-9a1d-5bb698adcfd6\"\n]"}]},"results":{"codes":[{"status":200,"language":"json","code":"","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eed67e6ec7282b00e3025c","ref":"","required":false,"desc":"ID of configuration you wish to delete sentiment phrases for","default":"","type":"string","name":"config_id","in":"body"}],"url":"/phrases.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2b8","createdAt":"2015-09-08T12:40:33.708Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":74,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"delete-sentiment-phrases","sync_unique":"","title":"Delete sentiment phrases","type":"delete","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

deleteDelete sentiment phrases


Body JSON

config_id:
string
ID of configuration you wish to delete sentiment phrases for

Definition

{{ api_url }}{{ page_api_url }}

Examples
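A runnable version of the stored Python example, deleting phrases by ID (key, secret, and config_id are placeholders):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
session.removePhrases(["85d14f37-cee6-4e28-9a1d-5bb698adcfd6"], config_id="id")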



{"category":"577e4bf24159cd1900d5d2b9","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d31d","createdAt":"2015-09-08T12:44:11.406Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":75,"body":"You can create entities up to the limit specified by your subscription. Entities have four values - name, type, label and normalized value. The name of the entity can contain a Boolean query. If it contains a query, you must preface the query definition with a + sign.\n\nYou can use these outputs to create a simple entity taxonomy. For example:\n\nname: coke OR \"coca cola\"\nType: Soda\nLabel: Coke Products\nNormalized: Coca-Cola\n\nname: Fanta\nType: Soda\nLabel: Coke Products\nNormalized: Fanta\n\nname: Pepsi\nType: Soda\nLabel: Pepsi Products\nNormalized: Pepsi-Cola\n\nEntities also come with their own sentiment and themes. For longer content, this allows you to focus on the sentiment associated with individual brands instead of the document as a whole.\n\nIn addition, if you have mentions enabled in your configuration, you will receive the offset and byte length of each individual mention of an entity.","excerpt":"","slug":"entity-basics","type":"basic","title":"Entity Basics","__v":0,"childrenPages":[]}

Entity Basics


You can create entities up to the limit specified by your subscription. Entities have four values - name, type, label and normalized value. The name of the entity can contain a Boolean query. If it contains a query, you must preface the query definition with a + sign.

You can use these outputs to create a simple entity taxonomy. For example:

name: coke OR "coca cola"
Type: Soda
Label: Coke Products
Normalized: Coca-Cola

name: Fanta
Type: Soda
Label: Coke Products
Normalized: Fanta

name: Pepsi
Type: Soda
Label: Pepsi Products
Normalized: Pepsi-Cola

Entities also come with their own sentiment and themes. For longer content, this allows you to focus on the sentiment associated with individual brands instead of the document as a whole.

In addition, if you have mentions enabled in your configuration, you will receive the offset and byte length of each individual mention of an entity.
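As a sketch, the soda example above written as the list of entity objects the entity endpoints accept. The + prefix on the first name marks it as a Boolean query, per the rule above:

entities = [
    {"name": "+coke OR \"coca cola\"", "type": "Soda",
     "label": "Coke Products", "normalized": "Coca-Cola"},
    {"name": "Fanta", "type": "Soda",
     "label": "Coke Products", "normalized": "Fanta"},
    {"name": "Pepsi", "type": "Soda",
     "label": "Pepsi Products", "normalized": "Pepsi-Cola"},
]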
{"__v":0,"_id":"577e4bf34159cd1900d5d31e","api":{"examples":{"codes":[{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret, use_compression=True )\nsession.getEntities( config_id = \"id\")","name":""}]},"results":{"codes":[{"status":200,"language":"json","code":"[\n   {\n      \"id\": \"68d6046b-e94e-4a1d-9d69-005bfdaea5a7\",\n      \"modified\": 0,\n      \"name\" : \"\\\"club chair\\\" OR \\\"task chair\\\" OR \\\"reclining chair\\\",\n      \"type\" : \"furniture\",\n      \"label\" : \"http://en.wikipedia.org/wiki/Chair\",\n      \"normalized\" : \"chair\"\n   }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eed86a6ec7282b00e30260","ref":"","required":false,"desc":"ID of configuration you wish to list entities for","default":"","type":"string","name":"config_id","in":"query"}],"url":"/entities.json"},"body":"","category":"577e4bf24159cd1900d5d2b9","createdAt":"2015-09-08T12:45:30.125Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":76,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"list-entities","sync_unique":"","title":"List Entities","type":"get","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

getList Entities


Query Params

config_id:
string
ID of configuration you wish to list entities for

Definition

{{ api_url }}{{ page_api_url }}

Examples
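A runnable version of the stored Python example for this endpoint (key, secret, and config_id are placeholders):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret, use_compression=True)
entities = session.getEntities(config_id="id")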


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d31f","api":{"examples":{"codes":[{"name":"","code":"[\n   {\n      \"name\" : \"\\\"club chair\\\" OR \\\"task chair\\\" OR \\\"reclining chair\\\",\n      \"type\" : \"furniture\",\n      \"label\" : \"http://en.wikipedia.org/wiki/Chair\",\n      \"normalized\" : \"chair\"\n   }\n]","language":"json"},{"code":"import semantria\nsession = semantria.Session(key, secret)\nsession.addEntities(\n  config_id = \"id\",\n  [\n   {\n      \"name\" : \"\\\"club chair\\\" OR \\\"task chair\\\" OR \\\"reclining chair\\\",\n      \"type\" : \"furniture\",\n      \"label\" : \"http://en.wikipedia.org/wiki/Chair\",\n      \"normalized\" : \"chair\"\n   }\n]","language":"python"}]},"results":{"codes":[{"name":"","code":"[\n   {\n      \"id\": \"68d6046b-e94e-4a1d-9d69-005bfdaea5a7\",\n      \"name\" : \"\\\"club chair\\\" OR \\\"task chair\\\" OR \\\"reclining chair\\\",\n      \"type\" : \"furniture\",\n      \"label\" : \"http://en.wikipedia.org/wiki/Chair\",\n      \"modified\": 0,\n      \"normalized\" : \"chair\"\n   }\n]","language":"json","status":202},{"name":"","code":"{}","language":"json","status":400}]},"settings":"","auth":"required","params":[{"_id":"55eed86a6ec7282b00e30260","ref":"","required":false,"desc":"ID of configuration you wish to list entities for","default":"","type":"string","name":"config_id","in":"body"}],"url":"/entities.json"},"body":"","category":"577e4bf24159cd1900d5d2b9","createdAt":"2015-09-08T12:46:14.012Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":77,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"create-entities","sync_unique":"","title":"Create Entities","type":"post","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

postCreate Entities


Body JSON

config_id:
string
ID of configuration you wish to create entities in

Definition

{{ api_url }}{{ page_api_url }}

Examples
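A runnable version of the stored Python example (key, secret, and config_id are placeholders; an unterminated string and the argument order in the stored snippet are corrected here):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret)
session.addEntities(
    [{
        "name": "\"club chair\" OR \"task chair\" OR \"reclining chair\"",
        "type": "furniture",
        "label": "http://en.wikipedia.org/wiki/Chair",
        "normalized": "chair"
    }],
    config_id="id"
)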


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d320","api":{"examples":{"codes":[{"language":"http","code":"https://api.semantria.com/entities.json?config_id=id\n[\n   {\n      \"name\" : \"\\\"club chair\\\" OR \\\"task chair\\\" OR \\\"reclining chair\\\",\n      \"type\" : \"furniture\",\n      \"label\" : \"http://en.wikipedia.org/wiki/Chair\",\n      \"normalized\" : \"chair\"\n   }\n]","name":""},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.updateEntities(\n  config_id = \"id\",\n  [\n   {\n      \"id\": \"68d6046b-e94e-4a1d-9d69-005bfdaea5a7\",\n      \"name\" : \"\\\"club chair\\\" OR \\\"task chair\\\" OR \\\"reclining chair\\\",,\n      \"type\" : \"furniture\",\n      \"label\" : \"http://en.wikipedia.org/wiki/Chair\",\n      \"normalized\" : \"chair\"\n   }\n]"}]},"results":{"codes":[{"status":200,"language":"json","code":"  [\n   {\n      \"id\": \"68d6046b-e94e-4a1d-9d69-005bfdaea5a7\",\n      \"name\" : \"\\\"club chair\\\" OR \\\"task chair\\\" OR \\\"reclining chair\\\",,\n      \"type\" : \"furniture\",\n      \"label\" : \"http://en.wikipedia.org/wiki/Chair\",\n      \"modified\": 1450197417, \n      \"normalized\" : \"chair\"\n   }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eed86a6ec7282b00e30260","ref":"","required":false,"desc":"ID of configuration you wish to list entities for","default":"","type":"string","name":"config_id","in":"body"}],"url":"/entities.json"},"body":"","category":"577e4bf24159cd1900d5d2b9","createdAt":"2015-09-08T12:46:52.379Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":78,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"update-entities","sync_unique":"","title":"Update Entities","type":"put","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

putUpdate Entities


Body JSON

config_id:
string
ID of configuration you wish to update entities in

Definition

{{ api_url }}{{ page_api_url }}

Examples
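A runnable version of the stored Python example, referencing the entity to update by ID (key, secret, and config_id are placeholders; string and argument-order bugs in the stored snippet are corrected here):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret)
session.updateEntities(
    [{
        "id": "68d6046b-e94e-4a1d-9d69-005bfdaea5a7",
        "name": "\"club chair\" OR \"task chair\" OR \"reclining chair\"",
        "type": "furniture",
        "label": "http://en.wikipedia.org/wiki/Chair",
        "normalized": "chair"
    }],
    config_id="id"
)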


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d321","api":{"examples":{"codes":[{"name":"","code":"[\n \"b09933f2-d274-4711-8ff5-2d994ce41c5a\"\n]","language":"json"},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.removeEntities(\n  config_id = \"id\",\n  [\n\t\t\"b09933f2-d274-4711-8ff5-2d994ce41c5a\"\n\t]\n)"}]},"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eed932b97ce63700d05912","ref":"","required":false,"desc":"","default":"","type":"string","name":"config_id","in":"body"}],"url":"/entities.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2b9","createdAt":"2015-09-08T12:48:50.690Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":79,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"delete-entities","sync_unique":"","title":"Delete Entities","type":"delete","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

deleteDelete Entities


Body JSON

config_id:
string
ID of the configuration you wish to delete entities from

Definition

{{ api_url }}{{ page_api_url }}

Examples
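A runnable version of the stored Python example, deleting entities by ID (key, secret, and config_id are placeholders):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret)
session.removeEntities(["b09933f2-d274-4711-8ff5-2d994ce41c5a"], config_id="id")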


Result Format



{"category":"577e4bf24159cd1900d5d2ba","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d305","createdAt":"2015-09-08T12:57:04.063Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":80,"body":"You can add as many terms to your blacklist as specified in your configuration. Blacklist items are referred to by name and are configuration-specific.","excerpt":"","slug":"blacklist-basics","type":"basic","title":"Blacklist Basics","__v":0,"childrenPages":[]}

Blacklist Basics


You can add as many terms to your blacklist as specified in your configuration. Blacklist items are referred to by name and are configuration-specific.
{"__v":0,"_id":"577e4bf34159cd1900d5d306","api":{"examples":{"codes":[{"language":"http","code":"https://api.semantria.com/blacklist.json?config_id=\"id\"","name":""},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.getBlacklist(\n  config_id = \"id\"\n)"}]},"results":{"codes":[{"status":200,"language":"json","code":"[\n    {\n        \"id\": \"2a7d539a-bf21-470f-b522-06c10fb3b9b6\", \n        \"modified\": 0, \n        \"name\": \"next quarter\"\n    }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eedb9e6af7743700e57ed5","ref":"","required":false,"desc":"Id of config you wish to see blacklist items for","default":"","type":"string","name":"config_id","in":"query"}],"url":"/blacklist.json"},"body":"","category":"577e4bf24159cd1900d5d2ba","createdAt":"2015-09-08T12:59:10.590Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":81,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"list-blacklist","sync_unique":"","title":"List Blacklist","type":"get","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

getList Blacklist


Query Params

config_id:
string
Id of config you wish to see blacklist items for

Definition

{{ api_url }}{{ page_api_url }}

Examples
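A runnable version of the stored Python example for this endpoint (key, secret, and config_id are placeholders):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret)
blacklist = session.getBlacklist(config_id="id")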


Result Format



{"__v":0,"_id":"577e4bf34159cd1900d5d307","api":{"examples":{"codes":[{"language":"json","code":"[\n\t\"quarter\",\n  \"year\"\n]","name":""},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.addBlacklist(\n  config_id = \"id\",\n  [\n    \"quarter\",\n    \"year\"\n  ]\n)"}]},"results":{"codes":[{"status":200,"language":"json","code":"","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eedb9e6af7743700e57ed5","ref":"","required":false,"desc":"Id of config you wish to see blacklist items for","default":"","type":"string","name":"config_id","in":"body"}],"url":"/blacklist.json"},"body":"","category":"577e4bf24159cd1900d5d2ba","createdAt":"2015-09-08T12:59:44.164Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":82,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"create-blacklist-item","sync_unique":"","title":"Create Blacklist item","type":"post","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

postCreate Blacklist item


Body JSON

config_id:
string
Id of config you wish to add blacklist items to

Definition

{{ api_url }}{{ page_api_url }}

Examples
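A runnable version of the stored Python example (key, secret, and config_id are placeholders; the argument order of the stored snippet is corrected so the item list comes before the keyword argument):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret)
session.addBlacklist(["quarter", "year"], config_id="id")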



{"__v":0,"_id":"577e4bf34159cd1900d5d308","api":{"examples":{"codes":[{"language":"http","code":"https://api.semantria.com/blacklist.json&config_id=\"id\"\n[\n   \".*@.*com\",\n   \".*@com\\\\.net\",\n   \"http://www\\\\..*\\\\.com\"\n]","name":""},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.updateBlacklist(\n  config_id = \"id\",\n  [\n    \"chair\",\n    \"sofa\"\n]"}]},"results":{"codes":[{"status":200,"language":"json","code":"","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eedb9e6af7743700e57ed5","ref":"","required":false,"desc":"Id of config you wish to see blacklist items for","default":"","type":"string","name":"config_id","in":"body"}],"url":"/blacklist.json"},"body":"","category":"577e4bf24159cd1900d5d2ba","createdAt":"2015-09-08T13:00:55.932Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":83,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"update-blacklist-item","sync_unique":"","title":"Update Blacklist item","type":"put","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

putUpdate Blacklist item


Body JSON

config_id:
string
Id of config you wish to update blacklist items for

Definition

{{ api_url }}{{ page_api_url }}

Examples
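A runnable version of the stored Python example (key, secret, and config_id are placeholders; argument order corrected as above):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret)
session.updateBlacklist(["chair", "sofa"], config_id="id")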



{"__v":0,"_id":"577e4bf34159cd1900d5d309","api":{"examples":{"codes":[{"language":"http","code":"https://api.semantria.com/blacklist.json&config_id=\"id\"\n[\n   \".*@.*com\",\n   \".*@com\\\\.net\",\n   \"http://www\\\\..*\\\\.com\"\n]","name":""},{"code":"import semantria\nsession = semantria.Session(key, secret)\nsession.removeBlacklist(\n  config_id=\"id\",\n  [\n    \"chair\",\n    \"sofa\"\n\t]\n)","language":"python"}]},"results":{"codes":[{"status":200,"language":"json","code":"","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[{"_id":"55eedb9e6af7743700e57ed5","ref":"","required":false,"desc":"Id of config you wish to see blacklist items for","default":"","type":"string","name":"config_id","in":"body"}],"url":"/blacklist.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2ba","createdAt":"2015-09-08T13:01:49.929Z","editedParams":true,"editedParams2":true,"excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":84,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"delete-blacklist-item","sync_unique":"","title":"Delete Blacklist item","type":"delete","updates":[],"user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

deleteDelete Blacklist item


Body JSON

config_id:
string
Id of config you wish to delete blacklist items from

Definition

{{ api_url }}{{ page_api_url }}

Examples
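A runnable version of the stored Python example (key, secret, and config_id are placeholders; argument order corrected as above):

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret)
session.removeBlacklist(["chair", "sofa"], config_id="id")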



{"category":"577e4bf24159cd1900d5d2bb","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2e1","createdAt":"2015-11-17T18:42:01.712Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":85,"body":"Semantria has graphical tools for looking at your account, such as the billing dashboard and SWEB. Those tools use these API endpoints.\n\nThe subscription endpoint returns the list of your account settings and limits as well as a list of all your configurations. This endpoint lists what features you have access to, not what features will be returned in the output. To see what features are enabled for output for a specific configuration, use the configuration endpoint.\n\nThe features endpoint returns all supported features for each language Semantria supports. Not all features might be enabled for your account, so this list might be different than what is returned by the subscription endpoint. \n\nThe statistics endpoint returns statistics about your account, such as the number of calls made, documents processed, and so on. Statistics also takes several parameters to allow you to specify the interval of time you wish to report on.\n\nThe status endpoint returns the current status of the API itself as well as version and languages.","excerpt":"","slug":"account-management-basics","type":"basic","title":"Account Management Basics","__v":0,"childrenPages":[]}

Account Management Basics


Semantria has graphical tools for looking at your account, such as the billing dashboard and SWEB. Those tools use these API endpoints.

The subscription endpoint returns the list of your account settings and limits as well as a list of all your configurations. This endpoint lists what features you have access to, not what features will be returned in the output. To see what features are enabled for output for a specific configuration, use the configuration endpoint.

The features endpoint returns all supported features for each language Semantria supports. Not all features might be enabled for your account, so this list might be different than what is returned by the subscription endpoint.

The statistics endpoint returns statistics about your account, such as the number of calls made, documents processed, and so on. Statistics also takes several parameters to allow you to specify the interval of time you wish to report on.

The status endpoint returns the current status of the API itself as well as version and languages.
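A sketch of reading these endpoints through the Python SDK. The method names getSubscription, getStatistics, and getStatus are assumptions by analogy with the SDK calls elsewhere in this reference, and the interval argument is illustrative:

import semantria

key, secret = "YOUR_KEY", "YOUR_SECRET"  # your Semantria credentials
session = semantria.Session(key, secret)
subscription = session.getSubscription()              # settings, limits, configs (assumed name)
statistics = session.getStatistics(interval="Month")  # usage for an interval (assumed name/args)
status = session.getStatus()                          # API status, version, languages (assumed name)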
{"category":"577e4bf24159cd1900d5d2bb","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2e2","createdAt":"2015-07-22T22:17:24.245Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":86,"body":"Note that our SDKs provide functions to take care of this for you. This is only to demonstrate what you need to do if you want to implement your own Semantria authentication code.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Authentication Algorithm\"\n}\n[/block]\n1. Obtain User Key and Secret (we emailed it to you when you registered with Semantria. Check your junk folder if you can't find it!)\n2. Generate signature base string (with URL)\n3. Generate OAuth signature for URL\n  * Encode signature base string with UTF-8 encoding. Keep encoded symbols in upper case.\n  * Calculate MD5 using secret key and use it in lower case.\n  * Convert signature base string and MD5 hash code of the secret into byte representations.\n  * Encrypt bytes of signature base string with MD5 hash code using HMAC-SHA1 algorithm.\n  * Convert result back into string form using Base64 algorithm and UTF-8 encoding.\n  * Encode string with URL encoding algorithm and write as oauth_signature parameter.\n4. Create Authorization Header as shown below. Note the single Authorization HTTP header with parameters separated by commas.\n5. Combine URL and header into the request and use it for authorization for the Semantria API.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Authorization\"\n}\n[/block]\nAuthorization determines whether users are allowed to do certain actions.\n\nThe Semantria API authorization model allows organized access to the API after the user is authenticated. After the user passes authentication, the Semantria authorization model allows or denies access to the user based on the user's subscription limits, account balance, account limits, configuration limits, and account expiration. If you're experiencing issues, please contact support-- they will be happy to help you out.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Authorization Header\"\n}\n[/block]\nThe authentication mechanism requires a signature base string. The base string is a combination of the complete URL pertaining to the shared endpoint and certain parameters: *oauth_consumer_key*, a public key; *oauth_nonce*, a random 64-bit unsigned ASCII decimal string; *oauth_signature_method*, any one-way algorithm to hash URL (e.g. HMAC-SHA1); *oauth_timestamp*, the current time stamp in numeric form; and *oauth_version*, OAuth version 1.0. Add these parameters to any request URL for the Semantria API.\n\n**The target URL for retrieving the document status with signature parameters will look like this:** \n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"https://api.semantria.com/document/Q24FT98RWX45.json?oauth_consumer_key=b36eab90ec7dcd8d&\\noauth_nonce=8e9a56a4c2cf47f&\\noauth_signature_method= HMAC-SHA1&\\noauth_timestamp=1272323042&\\noauth_version=1.0\",\n      \"language\": \"text\"\n    }\n  ]\n}\n[/block]\nSignature base strings will be encoded with an HMAC-SHA1 hash algorithm using the MD5 checksum of the secret key. 
The generated hash function should be URL encoded and added as an “oauth_signature” parameter for the authorization header.\n\nOnly the license-holder and Semantria know the MD5 checksum of the secret key. Both parties can generate hash functions from the signature base string. The server runs the same process and compares the results with the hash function generated by the server's secret key. The server identifies clients with an “oauth_consumer_key” parameter in the header.\n\nThe URL encoding function must be in UTF-8 upper case hexadecimal format.\n\n**The complete authorization header will look like this:** \n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"https://api.semantria.com/subscription.json?oauth_consumer_key=XXXXX&\\noauth_nonce=3931596951957366614&\\noauth_signature_method=HMAC-SHA1&\\noauth_timestamp=1320143435&\\noauth_version=1.0\\n\\nAuthorization: OAuth realm=““,\\n   oauth_consumer_key=“XXXXX”,\\n   oauth_nonce=“3931596951957366614”,\\n   oauth_signature=“PSILFVnqdp8Nl8PpwtF%2fxeguuAQ%3d”,\\n   oauth_signature_method=“HMAC-SHA1”,\\n   oauth_timestamp=“1320143435”,\\n   oauth_version=“1.0”“\",\n      \"language\": \"text\"\n    }\n  ]\n}\n[/block]","excerpt":"Authentication answers, \"Are you a registered user?\" through the following algorithm.","slug":"authentication","type":"basic","title":"Authentication","__v":0,"childrenPages":[]}

Authentication

Authentication answers, "Are you a registered user?" through the following algorithm.

Note that our SDKs provide functions that take care of this for you. The steps below are only needed if you want to implement your own Semantria authentication code.

[block:api-header] { "type": "basic", "title": "Authentication Algorithm" } [/block]

1. Obtain your User Key and Secret (we emailed them to you when you registered with Semantria; check your junk folder if you can't find them).
2. Generate the signature base string (the endpoint URL plus the OAuth parameters).
3. Generate the OAuth signature for the URL:
  * Encode the signature base string with UTF-8 encoding. Keep encoded symbols in upper case.
  * Calculate the MD5 checksum of the secret key and use it in lower case.
  * Convert the signature base string and the MD5 checksum of the secret into byte representations.
  * Sign the bytes of the signature base string with the MD5 checksum using the HMAC-SHA1 algorithm.
  * Convert the result back into string form using the Base64 algorithm and UTF-8 encoding.
  * URL-encode the string and write it as the oauth_signature parameter.
4. Create the Authorization header as shown below. Note that there is a single Authorization HTTP header, with parameters separated by commas.
5. Combine the URL and header into the request and use it for authorization against the Semantria API.

[block:api-header] { "type": "basic", "title": "Authorization" } [/block]

Authorization determines whether a user is allowed to perform certain actions.

The Semantria API authorization model governs access to the API once the user is authenticated. After the user passes authentication, Semantria allows or denies access based on the user's subscription limits, account balance, account limits, configuration limits, and account expiration. If you're experiencing issues, please contact support; they will be happy to help you out.

[block:api-header] { "type": "basic", "title": "Authorization Header" } [/block]

The authentication mechanism requires a signature base string. The base string is a combination of the complete URL of the endpoint and certain parameters: *oauth_consumer_key*, a public key; *oauth_nonce*, a random 64-bit unsigned ASCII decimal string; *oauth_signature_method*, a one-way algorithm used to hash the URL (e.g. HMAC-SHA1); *oauth_timestamp*, the current time stamp in numeric form; and *oauth_version*, OAuth version 1.0. Add these parameters to any request URL for the Semantria API.

**The target URL for retrieving a document's status, with signature parameters, will look like this:**

[block:code] { "codes": [ { "code": "https://api.semantria.com/document/Q24FT98RWX45.json?oauth_consumer_key=b36eab90ec7dcd8d&\noauth_nonce=8e9a56a4c2cf47f&\noauth_signature_method=HMAC-SHA1&\noauth_timestamp=1272323042&\noauth_version=1.0", "language": "text" } ] } [/block]

The signature base string is signed with the HMAC-SHA1 algorithm, using the MD5 checksum of the secret key as the key. The resulting hash is URL-encoded and added as the "oauth_signature" parameter of the authorization header.

Only the license holder and Semantria know the MD5 checksum of the secret key, so both parties can generate the same hash from the signature base string. The server runs the same process with its copy of the secret and compares the result with the hash sent by the client. The server identifies clients by the "oauth_consumer_key" parameter in the header.

The URL encoding function must produce UTF-8 upper-case hexadecimal escapes.

**The complete authorization header will look like this:**

[block:code] { "codes": [ { "code": "https://api.semantria.com/subscription.json?oauth_consumer_key=XXXXX&\noauth_nonce=3931596951957366614&\noauth_signature_method=HMAC-SHA1&\noauth_timestamp=1320143435&\noauth_version=1.0\n\nAuthorization: OAuth realm=\"\",\n oauth_consumer_key=\"XXXXX\",\n oauth_nonce=\"3931596951957366614\",\n oauth_signature=\"PSILFVnqdp8Nl8PpwtF%2fxeguuAQ%3d\",\n oauth_signature_method=\"HMAC-SHA1\",\n oauth_timestamp=\"1320143435\",\n oauth_version=\"1.0\"", "language": "text" } ] } [/block]
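Putting the algorithm together, here is a minimal Python sketch of the signing steps. The function name sign_request and its structure are illustrative assumptions, not the SDK's implementation; the parameter ordering is assumed alphabetical, matching the example URLs above. Consult our SDK source for the canonical implementation.

```python
import base64
import hashlib
import hmac
import random
import time
from urllib.parse import quote

def sign_request(url, consumer_key, secret):
    # OAuth parameters appended to the request URL (alphabetical order,
    # matching the example URLs in this section)
    params = {
        "oauth_consumer_key": consumer_key,
        "oauth_nonce": str(random.getrandbits(64)),  # random 64-bit unsigned decimal
        "oauth_signature_method": "HMAC-SHA1",
        "oauth_timestamp": str(int(time.time())),
        "oauth_version": "1.0",
    }
    query = "&".join(f"{k}={v}" for k, v in sorted(params.items()))
    base_string = url + "?" + query  # the signature base string

    # The MD5 checksum of the secret, in lower-case hex, is the HMAC key
    md5_of_secret = hashlib.md5(secret.encode("utf-8")).hexdigest()

    # HMAC-SHA1 over the base string, then Base64, then URL-encode
    # (urllib's quote emits upper-case hex escapes, as required)
    digest = hmac.new(md5_of_secret.encode("utf-8"),
                      base_string.encode("utf-8"), hashlib.sha1).digest()
    signature = quote(base64.b64encode(digest).decode("utf-8"), safe="")

    # Single Authorization header with comma-separated parameters
    auth_header = (
        f'OAuth realm="", '
        f'oauth_consumer_key="{consumer_key}", '
        f'oauth_nonce="{params["oauth_nonce"]}", '
        f'oauth_signature="{signature}", '
        f'oauth_signature_method="HMAC-SHA1", '
        f'oauth_timestamp="{params["oauth_timestamp"]}", '
        f'oauth_version="1.0"'
    )
    return base_string, {"Authorization": auth_header}

# Usage (keys are placeholders):
# url, headers = sign_request("https://api.semantria.com/status.json", "XXXXX", "YYYYY")
```

Pass the signed URL and the Authorization header to your HTTP client; the server recomputes the same hash with its copy of the secret and compares it with the one you sent.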
{"category":"577e4bf24159cd1900d5d2bb","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2e3","createdAt":"2015-07-07T21:28:41.243Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"examples":{"codes":[{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.getSubscription()","name":""}]},"results":{"codes":[{"status":200,"language":"json","code":"{\n  \"name\": \"tim.mohler@lexalytics.com\",\n  \"status\": \"active\",\n  \"billing_settings\": {\n    \"data_calls_limit\": 10,\n    \"settings_calls_limit\": 10,\n    \"polling_calls_limit\": 10,\n    \"data_calls_limit_interval\": 1,\n    \"settings_calls_limit_interval\": 1,\n    \"polling_calls_limit_interval\": 1,\n    \"docs_balance\": 797674,\n    \"settings_calls_balance\": 9,\n    \"polling_calls_balance\": 10,\n    \"data_calls_balance\": 10,\n    \"expiration_date\": 1456358400000,\n    \"limit_type\": \"metered\",\n    \"docs_suggested\" : 0,\n    \"docs_suggsted_interval\" : 0,\n    \"job_ids_allocated\": 0, \n    \"job_ids_permitted\": 10,\n    \"app_seats_permitted\": 10,\n    \"app_seats_allocated\": 8\n  },\n  \"basic_settings\": {\n    \"collection_limit\": 1000,\n    \"auto_response_batch_limit\": 2,\n    \"configurations_limit\": 100,\n    \"concept_topics_limit\": 100,\n    \"query_topics_limit\": 1000,\n    \"user_entities_limit\": 1000,\n    \"callback_batch_limit\": 100,\n    \"concept_topic_samples_limit\": 20,\n    \"return_source_text\": false,\n    \"characters_limit\": 1024000,\n    \"blacklist_limit\": 100,\n    \"sentiment_phrases_limit\": 1000,\n    \"incoming_batch_limit\": 100,\n    \"document_length\" : 2048,\n    \"polling_batch_limit\": 100,\n    \"summary_size_limit\": 100\n  },\n  \"feature_settings\": {\n    \"document\": {\n      \"concept_topics\": true,\n      \"query_topics\": true,\n      \"sentiment_phrases\": true,\n      \"user_entities\": true,\n      \"intentions\": true,\n      \"model_sentiment\": true,\n      \"language_detection\": true,\n      \"named_entities\": true,\n      \"pos_tagging\": true,\n      \"summary\": true,\n      \"themes\": true,\n      \"relations\": true,\n      \"mentions\": true,\n      \"opinions\": true,\n      \"auto_categories\": true\n    },\n    \"collection\": {\n      \"concept_topics\": true,\n      \"query_topics\": true,\n      \"user_entities\": true,\n      \"named_entities\": true,\n      \"themes\": true,\n      \"mentions\": false,\n      \"facets\": true\n    },\n    \"html_processing\": true,\n    \"supported_languages\": \"Chinese, English, French, German, Portuguese, Spanish, Italian, Korean, Arabic, Russian, Malay, Japanese, Dutch, Swedish, Norwegian, Danish\"\n  }\n \"templates\": [\n            {\n                \"config_id\": \"0581e02fb27066ac973c182545693f3e\", \n                \"id\": \"def_template_id_9438637610992\", \n                \"is_free\": true, \n                \"language\": \"English\", \n                \"name\": \"default_template_9438637610992\", \n                \"type\": \"language-default\",\n  \t            \"version\" : 1\n            }, \n            {\n                \"config_id\": \"625d64573c164bccad4631aa6349ac2d\", \n                \"id\": \"default-es\", \n                \"is_free\": true, \n                \"language\": \"es\", \n                \"name\": \"Spanish samples\", \n                \"type\": 
\"language-default\", \n                \"version\": \"1\"\n            }, \n            {\n                \"config_id\": \"4182936a494de53e3c13ce1feab70fdc\", \n                \"id\": \"default-fr\", \n                \"is_free\": true, \n                \"language\": \"fr\", \n                \"name\": \"French samples\", \n                \"type\": \"language-default\", \n                \"version\": \"1\"\n            }, \n            {\n                \"config_id\": \"7d265f1cda6fdb90d6ad7c72a873e890\", \n                \"id\": \"default-en\", \n                \"is_free\": true, \n                \"language\": \"en\", \n                \"name\": \"English samples\", \n                \"type\": \"language-default\", \n                \"version\": \"1\"\n            }, \n            {\n                \"config_id\": \"cba6daae76d64cc1658593f22fa0555b\", \n                \"id\": \"vp-en-hotel-1\", \n                \"is_free\": false, \n                \"language\": \"en\", \n                \"name\": \"Hotel\", \n                \"type\": \"vertical-pack\", \n                \"version\": \"1\"\n            }\n        ]\n    }, \n}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":"/subscription.[ json | xml ]"},"isReference":false,"order":87,"body":"","excerpt":"The subscription endpoint allows you to see what your subscription is entitled to use in Semantria.","slug":"subscription","type":"get","title":"Subscription","__v":0,"childrenPages":[]}

getSubscription

The subscription endpoint allows you to see what your subscription is entitled to use in Semantria.

Definition

GET https://api.semantria.com/subscription.[ json | xml ]

Examples
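
For example, with the Python SDK:

```python
import semantria
session = semantria.Session(key, secret)
session.getSubscription()
```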


Result Format



{"__v":0,"_id":"577e4bf24159cd1900d5d2e4","api":{"examples":{"codes":[{"language":"text","code":"GET https://api.semantria.com:443/features.json?language=en","name":""},{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.getSupportedFeatures()"}]},"results":{"codes":[{"status":200,"language":"json","code":"[\n  {\n\t\"id\": \"en\",\n\t\"language\": \"English\",\n\t\"html_processing\": true,\n\t\"settings\": {\n\t  \"blacklist\": true,\n\t  \"user_entities\": true,\n\t  \"sentiment_phrases\": true,\n\t  \"concept_topics\": true,\n\t  \"query_topics\": true\n\t},\n\t\"detailed_mode\": {\n\t  \"language_detection\": true,\n\t  \"pos_tagging\": true,\n\t  \"intentions\": true,\n\t  \"mentions\": true,\n\t  \"sentiment_phrases\": true,\n\t  \"themes\": true,\n\t  \"relations\": true,\n\t  \"named_entities\": true,\n\t  \"sentiment\": true,\n\t  \"summarization\": true,\n\t  \"user_entities\": true,\n\t  \"query_topics\": true,\n\t  \"auto_categories\": true,\n\t  \"concept_topics\": true,\n\t  \"opinions\": true\n\t},\n\t\"discovery_mode\": {\n\t  \"named_entities\": true,\n\t  \"mentions\": true,\n\t  \"facets\": true,\n\t  \"user_entities\": true,\n\t  \"concept_topics\": true,\n\t  \"themes\": true,\n\t  \"query_topics\": true,\n\t  \"attributes\": true\n\t}\n  }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""},{"status":200,"language":"xml","code":"<supported_features>\n  <features>\n\t<detailed_mode>\n\t  <language_detection>true</language_detection>\n\t  <pos_tagging>true</pos_tagging>\n\t  <intentions>true</intentions>\n\t  <theme_mentions>true</theme_mentions>\n\t  <sentiment_phrases>true</sentiment_phrases>\n\t  <entity_themes>true</entity_themes>\n\t  <themes>true</themes>\n\t  <entity_relations>true</entity_relations>\n\t  <named_entities>true</named_entities>\n\t  <sentiment>true</sentiment>\n\t  <entity_mentions>true</entity_mentions>\n\t  <summarization>true</summarization>\n\t  <user_entities>true</user_entities>\n\t  <queries>true</queries>\n\t  <auto_categories>true</auto_categories>\n\t  <user_categories>true</user_categories>\n\t  <entity_opinions>true</entity_opinions>\n\t</detailed_mode>\n\t<discovery_mode>\n\t  <named_entities>true</named_entities>\n\t  <entity_mentions>true</entity_mentions>\n\t  <facet_mentioins>true</facet_mentioins>\n\t  <facets>true</facets>\n\t  <user_entities>true</user_entities>\n\t  <theme_mentions>true</theme_mentions>\n\t  <user_categories>true</user_categories>\n\t  <themes>true</themes>\n\t  <queries>true</queries>\n\t  <facet_attributes>true</facet_attributes>\n\t</discovery_mode>\n\t<html_processing>true</html_processing>\n\t<id>en</id>\n\t<language>English</language>\n\t<settings>\n\t  <blacklist>true</blacklist>\n\t  <user_entities>true</user_entities>\n\t  <sentiment_phrases>true</sentiment_phrases>\n\t  <user_categories>true</user_categories>\n\t  <queries>true</queries>\n\t</settings>\n  </features>\n</supported_features>"}]},"settings":"","auth":"required","params":[{"_id":"55a5540618dc630d0005ddb2","ref":"","required":false,"desc":"The language parameter is entered as an ISO language code. It is optional and may be skipped. 
If no parameter is passed, Semantria will respond with a list of supported features, organized by language.","default":"","type":"array_string","name":"language","in":"query"}],"url":"https://api.semantria.com:443/features.json?"},"body":"","category":"577e4bf24159cd1900d5d2bb","createdAt":"2015-07-07T21:28:48.722Z","editedParams":true,"editedParams2":true,"excerpt":"This method returns a list of the supported features per languages supported by the Semantria API. The language parameter is optional and if not included, the endpoint will return features for every supported language to which you have access.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":88,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"features","sync_unique":"","title":"Features","type":"get","updates":[],"user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

getFeatures

This method returns a list of the supported features for each language supported by the Semantria API. The language parameter is optional; if it is not included, the endpoint returns features for every supported language to which you have access.

Query Params

language:
array of strings
The language parameter is entered as an ISO language code. It is optional and may be skipped. If no parameter is passed, Semantria will respond with a list of supported features, organized by language.

Definition

GET https://api.semantria.com:443/features.json

Examples
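
For example, requesting English features directly or via the Python SDK:

```text
GET https://api.semantria.com:443/features.json?language=en
```

```python
import semantria
session = semantria.Session(key, secret)
session.getSupportedFeatures()
```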


Result Format



{"__v":4,"_id":"577e4bf24159cd1900d5d2e5","api":{"auth":"required","examples":{"codes":[{"name":"","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.getStatistics(config_id = \"id\", interval=\"week\")","language":"python"},{"code":"#Statistics for entire account for a date range\nhttps://api.semantria.com/statistics.json?from=2016-01-01T00:00:00Z&to=2016-01-25T00:00:00Z\n#Statistics for entire account for one day\nhttps://api.semantria.com/statistics.json?interval=day\n#Statistics for a single configuration for one day\nhttps://api.semantria.com/statistics.json?interval=day&config=9f12650909332cf0389e27854d05bd75\n#Statistics for entire account for one day, grouped by config, app, language per hour\nhttps://api.semantria.com/statistics.json?interval=day&group=config,app,language,1h","language":"http"}]},"params":[{"_id":"55eedd10b97ce63700d05920","default":"Hour","desc":"(Required if no using from/to) Hour, Day, Week, Month, Year values are supported.","name":"interval","ref":"","required":true,"type":"string","in":"query"},{"_id":"577e4da32bcb6b0e00e9f818","default":"","desc":"(Required if not using interval) Start time for statistics, provided in UNIX epoch or ISO date format","name":"from","ref":"","required":true,"type":"string","in":"query"},{"_id":"577e4da32bcb6b0e00e9f817","default":"","desc":"Ending time rate in UNIX epoch or ISO date format","name":"to","ref":"","required":true,"type":"string","in":"query"},{"_id":"55eedd10b97ce63700d05921","ref":"","required":false,"desc":"Optional config to limit output, specified by ID","default":"","type":"string","name":"config_id","in":"query"},{"_id":"577e58b947a9ab0e003e00ee","ref":"","required":false,"desc":"Optional config to limit output, specified by name","default":"","type":"string","name":"config_name","in":"query"},{"_id":"577e4c7a6172c720001285af","ref":"","required":false,"desc":"config, language, app, user, time intervals in ‘(0-9){1,2}(m|h|d) format","default":"","type":"string","name":"group","in":"query"},{"_id":"577e58b947a9ab0e003e00ed","ref":"","required":false,"desc":"Optional, limits output to just the user_id specified","default":"","type":"string","name":"user_id","in":"query"},{"_id":"577e58b947a9ab0e003e00ec","ref":"","required":false,"desc":"Optional, limits output to just the user email specified","default":"","type":"string","name":"user_email","in":"query"},{"_id":"577e58b947a9ab0e003e00eb","ref":"","required":false,"desc":"Optional, limits output to just the app specified (Excel, API, etc)","default":"","type":"string","name":"app","in":"query"}],"results":{"codes":[{"name":"","code":"#Simple example from statistics?interval=week\n[\n    {\n#Number of batches queued in the time range\n        \"batches_queued\": 24,\n#Number of data calls in the time range\n        \"calls_data\": 24,\n#Number of polling calls in the time range\n        \"calls_polling\": 208,\n#Number of settings calls in the time range\n        \"calls_settings\": 6,\n#Name of the account\n        \"consumer_name\": \"test.user@lexalytics.com\",\n#Number of docs that failed in the interval\n        \"docs_failed\": 296,\n#Number of docs queued in the interval\n        \"docs_queued\": 2080,\n#Number of docs retreived in the interval\n        \"docs_retrieved\": 2080,\n#Number of docs that succeeded in the interval\n        \"docs_successful\": 1784,\n#Last application used to access Semantria\n        \"latest_used_app\": \"Python/3.8.77/JSON\",\n#Total number of API calls over the interval\n        
\"total_api_calls\": 238\n    }\n]","language":"json","status":200},{"name":"","code":"{}","language":"json","status":400},{"status":200,"language":"json","code":"#https://api.semantria/com/statistics?interval=day&group=config,6h\n[\n#One structure for each grouped element, in this case config\n  {\n        \"config_id\": \"01b71c5e66eddef4132fa1c8e29a6327\",\n        \"config_name\": \"chinese\",\n        \"consumer_name\": \"test.user@lexalytics.com\",\n        \"values\": [\n#One entry for each 6 hour range in the day\n            {\n                \"batches_queued\": 0,\n                \"calls_data\": 0,\n                \"calls_polling\": 0,\n                \"calls_settings\": 0,\n                \"docs_failed\": 0,\n                \"docs_queued\": 0,\n                \"docs_retrieved\": 0,\n                \"docs_successful\": 0,\n#Start time of the 6 hour interval\n                \"time\": 1467849600000,\n                \"total_api_calls\": 0\n            },\n            {\n                \"batches_queued\": 0,\n                \"calls_data\": 0,\n                \"calls_polling\": 0,\n                \"calls_settings\": 0,\n                \"docs_failed\": 0,\n                \"docs_queued\": 0,\n                \"docs_retrieved\": 0,\n                \"docs_successful\": 0,\n                \"time\": 1467871200000,\n                \"total_api_calls\": 0\n            },\n            {\n                \"batches_queued\": 0,\n                \"calls_data\": 0,\n                \"calls_polling\": 1,\n                \"calls_settings\": 0,\n                \"docs_failed\": 0,\n                \"docs_queued\": 0,\n                \"docs_retrieved\": 0,\n                \"docs_successful\": 0,\n                \"time\": 1467892800000,\n                \"total_api_calls\": 1\n            },\n            {\n                \"batches_queued\": 0,\n                \"calls_data\": 0,\n                \"calls_polling\": 0,\n                \"calls_settings\": 0,\n                \"docs_failed\": 0,\n                \"docs_queued\": 0,\n                \"docs_retrieved\": 0,\n                \"docs_successful\": 0,\n                \"time\": 1467914400000,\n                \"total_api_calls\": 0\n            }\n        ]\n    },\n    {\n        \"config_id\": \"41672c786059f8f189ed850894063108\",\n        \"config_name\": \"portuguese model\",\n        \"consumer_name\": \"test.user@lexalytics.com\",\n        \"values\": [\n            {\n                \"batches_queued\": 0,\n                \"calls_data\": 0,\n                \"calls_polling\": 0,\n                \"calls_settings\": 0,\n                \"docs_failed\": 0,\n                \"docs_queued\": 0,\n                \"docs_retrieved\": 0,\n                \"docs_successful\": 0,\n                \"time\": 1467849600000,\n                \"total_api_calls\": 0\n            },\n            {\n                \"batches_queued\": 0,\n                \"calls_data\": 0,\n                \"calls_polling\": 0,\n                \"calls_settings\": 0,\n                \"docs_failed\": 0,\n                \"docs_queued\": 0,\n                \"docs_retrieved\": 0,\n                \"docs_successful\": 0,\n                \"time\": 1467871200000,\n                \"total_api_calls\": 0\n            },\n            {\n                \"batches_queued\": 0,\n                \"calls_data\": 0,\n                \"calls_polling\": 1,\n                \"calls_settings\": 0,\n                \"docs_failed\": 0,\n                
\"docs_queued\": 0,\n                \"docs_retrieved\": 0,\n                \"docs_successful\": 0,\n                \"time\": 1467892800000,\n                \"total_api_calls\": 1\n            },\n            {\n                \"batches_queued\": 0,\n                \"calls_data\": 0,\n                \"calls_polling\": 0,\n                \"calls_settings\": 0,\n                \"docs_failed\": 0,\n                \"docs_queued\": 0,\n                \"docs_retrieved\": 0,\n                \"docs_successful\": 0,\n                \"time\": 1467914400000,\n                \"total_api_calls\": 0\n            }\n        ]\n    }\n  ]"}]},"settings":"","url":"/statistics.json"},"body":"","category":"577e4bf24159cd1900d5d2bb","createdAt":"2015-07-07T21:28:57.294Z","editedParams":true,"editedParams2":true,"excerpt":"Returns interval usage statistics over a date range, filtered and grouped by various options. All requests need to specify the date range with either the \"interval\" parameter or the \"from\" and \"to\" parameters. \n\nFrom and to parameters take either UNIX epoch time or ISO formats. Interval specifies the current internal. For an interval of \"day\" for example, the current day will be used, and for an interval of \"month\" use the current month. Interval and From/To are exclusive.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":89,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"statistics","sync_unique":"","title":"Statistics","type":"get","updates":[],"user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

getStatistics

Returns interval usage statistics over a date range, filtered and grouped by various options. All requests must specify the date range with either the "interval" parameter or the "from" and "to" parameters. The from and to parameters take either UNIX epoch time or ISO date formats. Interval uses the current interval: for an interval of "day", for example, the current day is used; for an interval of "month", the current month. Interval and from/to are mutually exclusive.

Query Params

interval:
required
string (default: Hour)
(Required if not using from/to) Hour, Day, Week, Month, and Year values are supported.
from:
required
string
(Required if not using interval) Start time for statistics, provided in UNIX epoch or ISO date format
to:
required
string
End time for statistics, provided in UNIX epoch or ISO date format
config_id:
string
Optional config to limit output, specified by ID
config_name:
string
Optional config to limit output, specified by name
group:
string
config, language, app, user, or time intervals in (0-9){1,2}(m|h|d) format (e.g. 6h)
user_id:
string
Optional, limits output to just the user_id specified
user_email:
string
Optional, limits output to just the user email specified
app:
string
Optional, limits output to just the app specified (Excel, API, etc)

Definition

GET https://api.semantria.com/statistics.json

Examples
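
For example, with the Python SDK, or with direct requests:

```python
import semantria
session = semantria.Session(key, secret)
session.getStatistics(config_id="id", interval="week")
```

```text
# Statistics for the entire account for a date range
https://api.semantria.com/statistics.json?from=2016-01-01T00:00:00Z&to=2016-01-25T00:00:00Z
# Statistics for the entire account for one day
https://api.semantria.com/statistics.json?interval=day
# Statistics for a single configuration for one day
https://api.semantria.com/statistics.json?interval=day&config=9f12650909332cf0389e27854d05bd75
# Statistics for the entire account for one day, grouped by config, app, and language per hour
https://api.semantria.com/statistics.json?interval=day&group=config,app,language,1h
```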


Result Format



{"category":"577e4bf24159cd1900d5d2bb","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2e6","createdAt":"2015-11-17T18:26:10.200Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"examples":{"codes":[{"language":"python","code":"import semantria\nsession = semantria.Session(key, secret)\nsession.getStatus()","name":""}]},"results":{"codes":[{"status":200,"language":"json","code":"[\n {\n   'api_version':  3.9,\n        'service_status':  'available',\n        'service_version':  '3.9.2',\n        'supported_compression':  'gzip,deflate',\n        'supported_encoding':  'UTF-8',\n        'supported_languages':  ['English', 'French', 'Spanish', 'Portuguese', 'German',         'Chinese', 'Italian', 'Korean', 'Japanese', 'Malay', 'Arabic Premium', 'Arabic', 'Russian Premium', 'Russian', 'Dutch', 'Swedish', 'Norwegian', 'Danish', 'Turkish Premium', 'Polish Premium']\n }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":"/status.json"},"isReference":false,"order":90,"body":"","excerpt":"","slug":"api-status","type":"get","title":"API Status","__v":0,"childrenPages":[]}

getStatus

Definition

GET https://api.semantria.com/status.json

Examples
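
For example, with the Python SDK:

```python
import semantria
session = semantria.Session(key, secret)
session.getStatus()
```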


Result Format
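
A sample 200 response:

```json
[
  {
    "api_version": 3.9,
    "service_status": "available",
    "service_version": "3.9.2",
    "supported_compression": "gzip,deflate",
    "supported_encoding": "UTF-8",
    "supported_languages": ["English", "French", "Spanish", "Portuguese",
      "German", "Chinese", "Italian", "Korean", "Japanese", "Malay",
      "Arabic Premium", "Arabic", "Russian Premium", "Russian", "Dutch",
      "Swedish", "Norwegian", "Danish", "Turkish Premium", "Polish Premium"]
  }
]
```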



{"category":"577e4bf24159cd1900d5d2bc","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2c9","createdAt":"2015-07-07T21:42:01.988Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":91,"body":"A configuration in Semantria is a combination of language, API settings and NLP tuning. It represents a way you want documents to be processed. This means you can have different configurations for different industry verticals (\"sick\" is not a sentiment bearing word in drug research for instance), or for different types of documents (tweets can be treated differently than news documents).\n\nEach configuration is identified by a unique ID assigned by Semantria at creation time. When you first sign up for Semantria, we create a number of configurations by default with some examples of what you can have in a configuration such as queries, categories and so on.\n\nOne configuration in the account is the primary configuration. You can change which configuration is primary, but you can never delete the one that is the current primary configuration. If you send documents for processing without specifying a configuration ID, they go to the primary account.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Language\"\n}\n[/block]\nThe most important configuration setting is language. Each configuration can have only one language specified, and that language cannot be changed once the configuration is created. This is because not all settings and features are supported for every language. You must determine the language of the documents you send to a configuration. Although Semantria can detect languages, we do not route documents based on language, we merely send back to you what language we thought the document was. \n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"One Sentence Mode\"\n}\n[/block]\nThe next most important setting in a configuration is one_sentence mode. When a configuration is in one_sentence mode, it adapts to the language commonly used in very short pieces of content such as tweets, Instagram updates, and many other types of status updates. In these types of content, punctuation and capitalization is often missing and there is common use of acronyms, emoji, and other types of shorthand. One_sentence mode is designed to deal with these issues. We recommend not turning one_sentence mode on for content longer than 3 sentences. \nNote: One_sentence mode is not available for Tier 2 languages.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Processing Settings\"\n}\n[/block]\nSemantria supports multiple ways of interacting with the API. More information is available about this in the Integration Scenarios section. One important thing to note is that if you plan to use Excel with this configuration, it must be in polling mode, not auto-response or callback.\n\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Thresholds\"\n}\n[/block]\nMost of the other settings in a configuration are thresholds. These control how many of a particular output is returned, or how confident we have to be in a match before we return it to you. 
If you set unwanted outputs to zero, your documents will process a little bit faster and you won't have to parse the unwanted output.","excerpt":"","slug":"why-configurations","type":"basic","title":"Configurations","__v":0,"childrenPages":[]}

Configurations


A configuration in Semantria is a combination of language, API settings, and NLP tuning. It represents a way you want documents to be processed. This means you can have different configurations for different industry verticals ("sick" is not a sentiment-bearing word in drug research, for instance) or for different types of documents (tweets can be treated differently than news documents).

Each configuration is identified by a unique ID assigned by Semantria at creation time. When you first sign up for Semantria, we create a number of configurations by default with some examples of what you can have in a configuration, such as queries, categories, and so on.

One configuration in the account is the primary configuration. You can change which configuration is primary, but you can never delete the one that is currently primary. If you send documents for processing without specifying a configuration ID, they go to the primary configuration.

[block:api-header] { "type": "basic", "title": "Language" } [/block]

The most important configuration setting is language. Each configuration can have only one language specified, and that language cannot be changed once the configuration is created. This is because not all settings and features are supported for every language. You must determine the language of the documents you send to a configuration. Although Semantria can detect languages, we do not route documents based on language; we merely send back to you what language we thought the document was.

[block:api-header] { "type": "basic", "title": "One Sentence Mode" } [/block]

The next most important setting in a configuration is one_sentence mode. When a configuration is in one_sentence mode, it adapts to the language commonly used in very short pieces of content such as tweets, Instagram updates, and many other types of status updates. In these types of content, punctuation and capitalization are often missing, and acronyms, emoji, and other types of shorthand are common. One_sentence mode is designed to deal with these issues. We recommend not turning on one_sentence mode for content longer than 3 sentences (see the sketch at the end of this page). Note: One_sentence mode is not available for Tier 2 languages.

[block:api-header] { "type": "basic", "title": "Processing Settings" } [/block]

Semantria supports multiple ways of interacting with the API; more information is available in the Integration Scenarios section. One important thing to note is that if you plan to use Excel with this configuration, it must be in polling mode, not auto-response or callback.

[block:api-header] { "type": "basic", "title": "Thresholds" } [/block]

Most of the other settings in a configuration are thresholds. These control how many of a particular output are returned, or how confident we have to be in a match before we return it to you. If you set unwanted outputs to zero, your documents will process a little faster and you won't have to parse the unwanted output.
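As a minimal sketch, here is what creating a short-content configuration with one_sentence mode enabled could look like. The field names are documented on the Configuration values page; the configuration name is illustrative:

```json
POST https://api.semantria.com/configurations.json
[
  {
    "name": "English tweets",
    "language": "English",
    "is_primary": false,
    "one_sentence_mode": true
  }
]
```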
{"category":"577e4bf24159cd1900d5d2bc","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2ca","createdAt":"2016-01-06T18:14:47.835Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":92,"body":"Industry Packs are sets of industry-specific NLP tuning. Sentiment phrases, entities, queries and intentions specific to that industry are included in an Industry Pack. Using an industry pack for the appropriate content will increase the accuracy of Semantria by more the ten percentage points. If your subscription entitles you, you can create a new configuration based off of a pack.\n\nEach pack is represented as a template. You can see the available packs via the /templates.json endpoint. To create a new configuration based on a pack via the API, you clone from the template, by passing in the template ID of the pack you wish to use to the /configurations.json endpoint. Alternatively, you can specify the pack you wish to base your configuration on in the SWEB creation wizard. You can watch a video on how to create an industry pack based configuration in our SWEB video [series](https://www.youtube.com/playlist?list=PLmIHux1QQeKsJ1DDql54MJURPzAUQWdUs\n)\n\nOnce a configuration is created, you will see entities and queries that are specific to the industry. For instance, in the Airlines pack, you will see the names of major airlines, airports, industry alliances, loyalty programs, and jet models in the entities section. Since at this point it is just another configuration, you can modify them as you see fit. Note that when Lexalytics updates an industry pack, the version of the template will change, but the changes will not flow down to any configurations you created with the previous version. \n\nAlthough sentiment is tuned for the industry, you will not see any phrases listed in the phrase section of the configuration. The number of phrases added or removed is quite large for each pack and listing multiple thousands of phrases would not be feasible. \n\nIntentions are tuned for each pack, but the intentions are not user-modifiable and thus the modifications are not listed for intentions.\n\nThe current list of industry packs is here: https://www.lexalytics.com/technology/industry-packs","excerpt":"","slug":"industry-packs","type":"basic","title":"Industry Packs","__v":0,"childrenPages":[]}

Industry Packs


Industry Packs are sets of industry-specific NLP tuning. Sentiment phrases, entities, queries, and intentions specific to that industry are included in an Industry Pack. Using an industry pack on the appropriate content will increase the accuracy of Semantria by more than ten percentage points. If your subscription entitles you to them, you can create a new configuration based on a pack.

Each pack is represented as a template. You can see the available packs via the /templates.json endpoint. To create a new configuration based on a pack via the API, you clone from the template by passing the template ID of the pack you wish to use to the /configurations.json endpoint (see the sketch below). Alternatively, you can specify the pack you wish to base your configuration on in the SWEB creation wizard. You can watch a video on how to create an industry-pack-based configuration in our SWEB video [series](https://www.youtube.com/playlist?list=PLmIHux1QQeKsJ1DDql54MJURPzAUQWdUs).

Once a configuration is created, you will see entities and queries that are specific to the industry. For instance, in the Airlines pack, you will see the names of major airlines, airports, industry alliances, loyalty programs, and jet models in the entities section. Since at this point it is just another configuration, you can modify them as you see fit. Note that when Lexalytics updates an industry pack, the version of the template will change, but the changes will not flow down to any configurations you created with the previous version.

Although sentiment is tuned for the industry, you will not see any phrases listed in the phrases section of the configuration. The number of phrases added or removed is quite large for each pack, and listing many thousands of phrases would not be feasible.

Intentions are tuned for each pack, but intentions are not user-modifiable, so those modifications are not listed either.

The current list of industry packs is here: https://www.lexalytics.com/technology/industry-packs
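For example, cloning a new configuration from the Hotel pack template could look like the sketch below. The from_template_config_id value is the Hotel template's config_id from the subscription example earlier in this document; the configuration name is illustrative:

```json
POST https://api.semantria.com/configurations.json
[
  {
    "name": "Hotel reviews",
    "language": "English",
    "is_primary": false,
    "from_template_config_id": "cba6daae76d64cc1658593f22fa0555b"
  }
]
```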
{"category":"577e4bf24159cd1900d5d2bc","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2cb","createdAt":"2015-09-17T20:48:21.471Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":93,"body":"An example of a JSON object for setting the values is below. Each value has a comment for its type and default value.\n\nThe mandatory values for creating a configuration are:\n\nname\nlanguage\nis_primary\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"POST https://api.semantria.com/configurations.json\\n[\\n   {\\n#ID of configuration for which you want to set values. \\n#Type string.\\n#No default.\\n      \\\"config_id\\\" : \\\"\\\",\\n#Name of configuration.\\n#Type string\\n#No default\\n      \\\"name\\\" : \\\"New test configuration\\\",\\n#Whether this should be the primary configuration in your account\\n#Type Boolean\\n#Default false\\n      \\\"is_primary\\\" : true,\\n#Whether to use the auto-response method or not. \\n#Type Boolean\\n#Defaults to false\\n      \\\"auto_response\\\" : false,\\n#Which language you are going to process with this config\\n#Type string\\n#No default\\n      \\\"language\\\" : \\\"English\\\",\\n#Percentage of content that has to be alpha-numeric for processing to succeed. Documents that do not meet this threshold will be returned to you with a FAILED status.\\n#Type integer\\n#Default 80\\n      \\\"alphanumeric_threshold\\\" : 80,\\n#Confidence level a category match must have to be returned for a document\\n#Type double\\n#Default 0.45\\n      \\\"categories_threshold\\\" : 0.45,\\n#Confidence level an entity match must have to be returned for a document\\n#Type double\\n#Default 0.55\\n\\t  \\t\\\"entities_threshold\\\" : 0,\\n#Template id this config was created from, if any\\n#Type string\\n#Default none\\n\\t\\t\\t\\\"from_template_config_id\\\": \\\"cba6daae76d64cc1658593f22fa0555b\\\",\\n#Whether to treat a document as a single sentence and ignore capitalization\\n#Type Boolean\\n#Default false\\n      \\\"one_sentence_mode\\\" : false,\\n#Whether to treat a document as well-formed HTML and extract text fields\\n#Type Boolean\\n#Default false\\n      \\\"process_html\\\" : false,\\n#What URL Semantria should POST data to when processed. 
Only set this if you are using the callback data retrieval method.\\n#Type string\\n#Default null\\n      \\\"callback\\\" : \\\"https://anyapi.anydomain.com/processed.json\\\",\\n      \\\"document\\\" : {\\n#Whether to retrieve intentions for a document or not.\\n#Type: Boolean\\n#Default false\\n         \\\"intentions\\\" : false,\\n#Which Parts of Speech to return\\n#Type String\\n#Default Null\\n         \\\"pos_types\\\" : \\\"Noun,Verb,Adjective\\\",\\n#Return sentiment phrases\\n#Type Boolean\\n#Default true\\n         \\\"sentiment_phrases\\\" : true,\\n#Return Auto Categories\\n#Type Boolean\\n#Default true\\n         \\\"auto_categories\\\" : true,\\n#Return Concept Topics\\n#Type Boolean\\n#Default false\\n         \\\"concept_topics\\\" : false,\\n#Return Query Topics\\n#Type Boolean\\n#Default true\\n         \\\"query_topics\\\" : true,\\n#Return Named Entities (automatically discovered by Semantria)\\n#Type Boolean\\n#Default true\\n         \\\"named_entities\\\" : true,\\n#Return User Entities (defined by the user)\\n#Type Boolean\\n#Default true\\n         \\\"user_entities\\\" : true,\\n#Return themes\\n#Type Boolean\\n#Default true\\n         \\\"themes\\\" : true,\\n#Return individual mentions of entities, themes and queries\\n#Type Boolean\\n#Default false\\n         \\\"mentions\\\" : false,\\n#Return relations\\n#Type Boolean\\n#Default false\\n         \\\"relations\\\" : false,\\n#Return opinions\\n#Type Boolean\\n#Default false\\n         \\\"opinions\\\" : false,\\n#Length of summary in sentences to return per document\\n#Type Integer\\n#Default 3\\n         \\\"summary_size\\\" : 3,\\n#Whether to detect the language of the document. Documents not in the same language as the config will be returned to you with a FAILED status.\\n#Type Boolean\\n#Default false\\n         \\\"detect_language\\\" : true\\n      },\\n      \\\"collection\\\" : {\\n#Return facets per collection\\n#Type Boolean\\n#Default true\\n         \\\"facets\\\" : true,\\n#Return attributes per collection\\n#Type Boolean\\n#Default true\\n         \\\"attributes\\\" : true,\\n#Number of individual mentions to return per collection\\n#Type Boolean\\n#Default false\\n         \\\"mentions\\\" : false,\\n#Return concept topics per collection\\n#Type Boolean\\n#Default true\\n         \\\"concept_topics\\\" : true,\\n#Return queries per collection\\n#Type Boolean\\n#Default true\\n         \\\"query_topics\\\" : true,\\n#Return entities per collection\\n#Type Boolean\\n#Default true\\n         \\\"named_entities\\\" : true,\\n#Return themes per collection\\n#Type Boolean\\n#Default true\\n         \\\"themes\\\" : true,\\n#Return user-defined entities per collection\\n#Type Boolean\\n#Default true\\n\\t\\t\\t\\t \\\"user_entitities\\\" : true\\n      }\\n   }\\n]\",\n      \"language\": \"json\"\n    }\n  ]\n}\n[/block]","excerpt":"A list of all settable values associated with a Semantria configuration","slug":"configuration-values","type":"basic","title":"Configuration values","__v":0,"childrenPages":[]}

Configuration values

A list of all settable values associated with a Semantria configuration

An example of a JSON object for setting the values is below. Each value has a comment for its type and default value. The mandatory values for creating a configuration are: name language is_primary [block:code] { "codes": [ { "code": "POST https://api.semantria.com/configurations.json\n[\n {\n#ID of configuration for which you want to set values. \n#Type string.\n#No default.\n \"config_id\" : \"\",\n#Name of configuration.\n#Type string\n#No default\n \"name\" : \"New test configuration\",\n#Whether this should be the primary configuration in your account\n#Type Boolean\n#Default false\n \"is_primary\" : true,\n#Whether to use the auto-response method or not. \n#Type Boolean\n#Defaults to false\n \"auto_response\" : false,\n#Which language you are going to process with this config\n#Type string\n#No default\n \"language\" : \"English\",\n#Percentage of content that has to be alpha-numeric for processing to succeed. Documents that do not meet this threshold will be returned to you with a FAILED status.\n#Type integer\n#Default 80\n \"alphanumeric_threshold\" : 80,\n#Confidence level a category match must have to be returned for a document\n#Type double\n#Default 0.45\n \"categories_threshold\" : 0.45,\n#Confidence level an entity match must have to be returned for a document\n#Type double\n#Default 0.55\n\t \t\"entities_threshold\" : 0,\n#Template id this config was created from, if any\n#Type string\n#Default none\n\t\t\t\"from_template_config_id\": \"cba6daae76d64cc1658593f22fa0555b\",\n#Whether to treat a document as a single sentence and ignore capitalization\n#Type Boolean\n#Default false\n \"one_sentence_mode\" : false,\n#Whether to treat a document as well-formed HTML and extract text fields\n#Type Boolean\n#Default false\n \"process_html\" : false,\n#What URL Semantria should POST data to when processed. Only set this if you are using the callback data retrieval method.\n#Type string\n#Default null\n \"callback\" : \"https://anyapi.anydomain.com/processed.json\",\n \"document\" : {\n#Whether to retrieve intentions for a document or not.\n#Type: Boolean\n#Default false\n \"intentions\" : false,\n#Which Parts of Speech to return\n#Type String\n#Default Null\n \"pos_types\" : \"Noun,Verb,Adjective\",\n#Return sentiment phrases\n#Type Boolean\n#Default true\n \"sentiment_phrases\" : true,\n#Return Auto Categories\n#Type Boolean\n#Default true\n \"auto_categories\" : true,\n#Return Concept Topics\n#Type Boolean\n#Default false\n \"concept_topics\" : false,\n#Return Query Topics\n#Type Boolean\n#Default true\n \"query_topics\" : true,\n#Return Named Entities (automatically discovered by Semantria)\n#Type Boolean\n#Default true\n \"named_entities\" : true,\n#Return User Entities (defined by the user)\n#Type Boolean\n#Default true\n \"user_entities\" : true,\n#Return themes\n#Type Boolean\n#Default true\n \"themes\" : true,\n#Return individual mentions of entities, themes and queries\n#Type Boolean\n#Default false\n \"mentions\" : false,\n#Return relations\n#Type Boolean\n#Default false\n \"relations\" : false,\n#Return opinions\n#Type Boolean\n#Default false\n \"opinions\" : false,\n#Length of summary in sentences to return per document\n#Type Integer\n#Default 3\n \"summary_size\" : 3,\n#Whether to detect the language of the document. 
Documents not in the same language as the config will be returned to you with a FAILED status.\n#Type Boolean\n#Default false\n \"detect_language\" : true\n },\n \"collection\" : {\n#Return facets per collection\n#Type Boolean\n#Default true\n \"facets\" : true,\n#Return attributes per collection\n#Type Boolean\n#Default true\n \"attributes\" : true,\n#Return individual mentions per collection\n#Type Boolean\n#Default false\n \"mentions\" : false,\n#Return concept topics per collection\n#Type Boolean\n#Default true\n \"concept_topics\" : true,\n#Return queries per collection\n#Type Boolean\n#Default true\n \"query_topics\" : true,\n#Return entities per collection\n#Type Boolean\n#Default true\n \"named_entities\" : true,\n#Return themes per collection\n#Type Boolean\n#Default true\n \"themes\" : true,\n#Return user-defined entities per collection\n#Type Boolean\n#Default true\n\t\t\t\t \"user_entities\" : true\n }\n }\n]", "language": "json" } ] } [/block]
{"category":"577e4bf24159cd1900d5d2bc","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2cc","createdAt":"2015-07-07T21:44:29.398Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":94,"body":"Semantria scores sentiment based on a pre-configured dictionary of phrases that are broadly applicable to many domains. However, each domain also has specific phrases that differ from the broad usage. You can increase sentiment accuracy by editing your configuration. \n\nSentiment phrases can be from one to three words long or be a Boolean query. The longest phrase will win if there are sub-phrases found. For instance, the word \"crude\" is scored as a negative out of the box, but the phrase \"crude oil\" is not sentiment bearing. Since \"crude oil\" is configured as neutral and is longer than \"crude\" when we see the phrase \"crude oil\" it will not be given sentiment.\n\nPart of speech plays a role. We don't want to assign sentiment to proper nouns generally - \"love\" may be positive, but not \"Courtney Love\". Thus, not only does the phrase have to exist in the text, it has to match the proper parts of speech.\n\nPhrases also obey negators and intensifiers, such as not and very. Because of this, you should usually not enter a negated phrase such as \"not good.\" Enter the phrase that carries the sentiment (good) and let the NLP engine figure out the negation.\n\nWhen you add sentiment phrases to a configuration, you can give them a score of -2 to +2. We recommend keeping the scores within -1 to +1, but for particularly strong words, you can exceed that.\n\nWhen tuning, pay attention to the frequency of the words in your data set and focus on the most frequently occurring ones. Also, think about alternative uses that might not be sentiment bearing, especially with single words. For instance, \"garbage\" might seem on the surface to be always negative, but \"Taking out the garbage\" is likely not sentiment bearing, and certainly \"garbage truck\" or \"garbage collectors\" are not. Below are some examples.\n[block:parameters]\n{\n  \"data\": {\n    \"h-0\": \"Tonality\",\n    \"h-1\": \"Weight\",\n    \"h-2\": \"Phrase\",\n    \"0-0\": \"Completely positive\",\n    \"0-1\": \"1\",\n    \"0-2\": \"perfect\",\n    \"1-0\": \"Mostly positive\",\n    \"1-1\": \"0.6\",\n    \"1-2\": \"great\",\n    \"2-0\": \"Somewhat positive\",\n    \"2-1\": \"0.3\",\n    \"2-2\": \"good\",\n    \"3-0\": \"Somewhat negative\",\n    \"3-1\": \"-0.3\",\n    \"3-2\": \"small NEAR/3 screen\",\n    \"4-0\": \"Mostly negative\",\n    \"4-1\": \"-0.6\",\n    \"4-2\": \"\\\"always breaking\\\"\",\n    \"5-0\": \"Completely negative\",\n    \"5-1\": \"-1\",\n    \"5-2\": \"\\\"worst experience ever\\\"\"\n  },\n  \"cols\": 3,\n  \"rows\": 6\n}\n[/block]","excerpt":"Sentiment phrases are the easiest way to adjust the sentiment output.","slug":"sentiment-bearing-phrases","type":"basic","title":"Sentiment phrases","__v":0,"childrenPages":[]}

Sentiment phrases

Sentiment phrases are the easiest way to adjust the sentiment output.

Semantria scores sentiment based on a pre-configured dictionary of phrases that are broadly applicable to many domains. However, each domain also has specific phrases that differ from the broad usage. You can increase sentiment accuracy by editing your configuration.

Sentiment phrases can be from one to three words long or be a Boolean query. The longest phrase wins if sub-phrases are found. For instance, the word "crude" is scored as negative out of the box, but the phrase "crude oil" is not sentiment bearing. Since "crude oil" is configured as neutral and is longer than "crude", when we see the phrase "crude oil" it will not be given sentiment.

Part of speech plays a role. We generally don't want to assign sentiment to proper nouns - "love" may be positive, but not "Courtney Love". Thus, not only does the phrase have to exist in the text, it also has to match the proper part of speech.

Phrases also obey negators and intensifiers, such as "not" and "very". Because of this, you should usually not enter a negated phrase such as "not good." Enter the phrase that carries the sentiment (good) and let the NLP engine figure out the negation.

When you add sentiment phrases to a configuration, you can give them a score of -2 to +2. We recommend keeping the scores within -1 to +1, but for particularly strong words, you can exceed that.

When tuning, pay attention to the frequency of the words in your data set and focus on the most frequently occurring ones. Also, think about alternative uses that might not be sentiment bearing, especially with single words. For instance, "garbage" might seem on the surface to be always negative, but "taking out the garbage" is likely not sentiment bearing, and certainly "garbage truck" or "garbage collectors" are not. Below are some examples.

| Tonality | Weight | Phrase |
|---|---|---|
| Completely positive | 1 | perfect |
| Mostly positive | 0.6 | great |
| Somewhat positive | 0.3 | good |
| Somewhat negative | -0.3 | small NEAR/3 screen |
| Mostly negative | -0.6 | "always breaking" |
| Completely negative | -1 | "worst experience ever" |
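To make the longest-match and negation rules concrete, here is a toy Python sketch of a phrase dictionary applying them. It is not Semantria's engine, and the scores are illustrative:

```python
# Toy illustration of two rules described above (not Semantria's engine):
# the longest dictionary phrase wins, and a preceding negator flips the score.
PHRASES = {"crude": -0.4, "crude oil": 0.0, "good": 0.3}  # illustrative scores
NEGATORS = {"not", "never"}

def score(tokens):
    total, i = 0.0, 0
    while i < len(tokens):
        for n in (3, 2, 1):  # try the longest phrase first (up to 3 words)
            phrase = " ".join(tokens[i:i + n])
            if phrase in PHRASES:
                s = PHRASES[phrase]
                if i > 0 and tokens[i - 1] in NEGATORS:
                    s = -s  # crude stand-in for real negation handling
                total += s
                i += n
                break
        else:
            i += 1
    return total

print(score("crude oil prices rose".split()))   # 0.0: "crude oil" beats "crude"
print(score("the movie was not good".split()))  # -0.3: negator flips "good"
```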
{"category":"577e4bf24159cd1900d5d2bc","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2cd","createdAt":"2015-07-07T21:43:46.139Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":95,"body":"## General Guidelines\n\n* Default max length: 1,500 characters (but may be changed - contact us)\n* Query cannot be empty\n* Operators must be CAPITALIZED\n* NEAR accepts values from 1 to 99 (e.g. NEAR/3)\n* Operators are always surrounded by words (e.g. coffee AND tea AND decaf)\n* Double-check opening and closing quotes and parentheses\n* Query terms cannot contain special characters\n* Special characters are: `! @ # $ % ^ ( ) _ - = ~ + [ ] { } ( ) | \" ' : ; . , < > ? / 1 2 3 4 5 6 7 8 9 0\n* Spaces are special characters\n* Terms containing special characters or phrases containing more than two words should be enclosed (escaped) in quotes (e.g. \"#beautifulflowers\", \"customer service\", \"123 Ave Rosemont\", \"rendez-vous\")\n* Queries can contain stopwords, but they must be enclosed in quotes\n\n## Operators\nNote that operators **must** be capitalized, otherwise they will be treated as a query term. Query operators must also be preceded and followed by query terms or query phrases.\n\n## OR operator\nInside a query, the OR operator may be used to retrieve documents containing either of two terms.\n\n**Example:**\n*onions OR cheese* will detect \"Onions make my eyes water\", \"My favorite cheese is cheddar\", and \"I want cheese and onions on my pizza\".\n\n## AND operator\nInside a query, the AND operator may be used to retrieve documents containing both specified terms.\n\n**Example:**\n*onions AND cheese* will detect \"I want cheese and onions on my pizza\" or \" I like cheese on my onion rings\", but not \"Onions make my eyes water\" or \"My favorite cheese is cheddar.\"\n\n## NEAR operator\nA NEAR operator is effectively an AND operator where you can control the distance between the words. *onions NEAR cheese* means that the term cheese must exist within 10 words of onions. The default distance is 10 words, but you can vary the distance the NEAR operation uses by adding a number suffix such as *onions NEAR/50 cheese*, which means the *onion* must exist within 50 words of *cheese*. 
This window can be between 1 and 99.\n\n**Other examples include:**\n*(onions OR bananas) NEAR/5 (cheese OR dinner)* would tag \"The banana split was included with dinner\" and \"The steak dinner with onions was my favorite.\" This query will **not** detect sentences like \"The cheese platter on the dinner menu was superb\" or \"Bananas, strawberries, and ice cream are not a balanced dinner.\"\n\n*(onions NEAR/5 cheese)* would tag a comment like \"Do you want onions on top of your cheese?\" but not \"Their cheese is my favorite but only on the dish with carmelized onions.\"\n[block:callout]\n{\n  \"type\": \"warning\",\n  \"title\": \"Do not use the NEAR operator in the following fashions:\",\n  \"body\": \"*\\\"onions NEAR/10 cheese\\\"* – this does nothing\\n*onions \\\"NEAR/10\\\" cheese* – this does nothing\"\n}\n[/block]\n## NOTNEAR operator\nA NOTNEAR operator is effectively a NOT operator where you can control the distance between the words. *onions NOTNEAR cheese* means that the term cheese cannot exist within 10 words of onions. The default distance is 10 words, but you can vary the distance the NOTNEAR operation uses by adding a number suffix such as *onions NEAR/50 cheese*, which means the *onion* cannot exist within 50 words of *cheese*. This window can be between 1 and 99.\n\n## WITH operator\nA WITH operator requires that the two terms occur within the same sentence. As such, it is the same as a NEAR operator, with the exception that the match window between the two terms is not specified.\n\n*\"onions WITH cheese\"* means that the term cheese must exist within the same sentence as onions.​\n\n## NOTWITH operator\nA NOTWITH operator requires that the two terms cannot occur within the same sentence. As such, it is the same as a NOTNEAR operator, with the exception that the match window between the two terms is not specified.\n\n*\"onions NOTWITH cheese\"* means that the term cheese cannot exist within the same sentence as onions.​\n\n\n## NOT operator\nThe NOT operator excludes any documents containing the term which follows it. *onions NOT celery* will return all uses of onion, excluding those that contain \"celery.\" A query must contain at least one non-excluded term when using the NOT operator.\n\n**Example**\n*onions NOT celery* will detect \"I like onions very much\" but not \"I like onions on my sandwich and celery on the side.\"\n\n## EXCLUDE operator\nTwo query terms of any type may be joined by an EXCLUDE operator, e.g. *York EXCLUDE \"New York\"*. The effect is different than that of the NOT operator. The query will return documents with the word \"York\", excluding those that only contain occurrences of \"New York\".\n\n**Consider the following sample text:** \n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"I spent the day in York, visiting the magnificent cathedral. Then it was time to head back to London for my flight home to New York.\",\n      \"language\": \"text\"\n    }\n  ]\n}\n[/block]\n**This text would generate the following results for the provided queries:**\n*York NOT \"New York\"*: FALSE\n*York EXCLUDE \"New York\"*: TRUE\n\n## Parentheses\nQueries can use parentheses to control the logic of the query and they may appear in any combination.\n\n**Two examples of queries with smart uses of parentheses are:**\n*((onions OR cheese) AND celery) NOT horrible\n(onions OR cheese) NEAR (horrible OR disgusting)*\n\nEvery left parenthesis must have a corresponding right parenthesis. 
Queries can have nested parentheses up to 10 levels deep.\n\n## Queries\n\n##Terms and Phrases\nSingle query terms are the simplest query element, consisting of a single word. \n[block:callout]\n{\n  \"type\": \"warning\",\n  \"body\": \"A query term can be an operator or a word that appears in a stopword list **only** if it is in quotations.\",\n  \"title\": \"Query term from stopword list\"\n}\n[/block]\n A query term cannot contain punctuation or other special characters like `! @ # $ % ^ ( ) _ = ~ + [ ] { } ( ) | \" ' : ; . , < > ? / -\n\nPhrases must be enclosed in double quotes. When a single word is enclosed in quotes, it is not treated as a phrase search: it is treated like a single word.\n\n## Wildcards\nA wildcard character (&#42;) may be used at the end of a single word query term or within a phrase. It allows the system to tag all spellings of the word starting with the letters before the wildcard (&#42;). Wildcards will only work in phrases if they are attached to the last term in the phrase. \n\n**For example:**\n*excit&#42;* would match excite, exciting, excitement, etc.\n\"*running fast&#42;*\" would match \"running fast\" and \"running faster\".\n\n[block:callout]\n{\n  \"type\": \"warning\",\n  \"body\": \"There must be at least a three-letter prefix to a wildcard query. d&#42;, do&#42;, and dog&#42;M are all invalid. Queries like *\\\"&#42;\\\"* and *Commonwealth AND *\\\"&#42;\\\"* are invalid and achieve nothing.\"\n}\n[/block]\n## Nested Queries\nReferencing a query is done by placing an asterisk (&#42;) that the beginning of a query name and wrapping the asterisk and the query name in parentheses \"( )\". It signals to the system to look for a query and use it in another query. For example, consider the following queries:\n\n**Dirty** *dirty OR filth&#42; OR disgust&#42; OR nasty*\n**Bathroom** *bathroom&#42; OR toilet&#42; OR restroom&#42; OR lavatory&#42;*\n**Restaurant_Interior** *restaurant OR table&#42; OR chair&#42; OR carpet&#42; OR furniture&#42; OR plate&#42; OR cup&#42;*\n\nTwo queries can be combined to create a nested query. \n\n**For example:**\n**Dirty Bathroom** *(&#42;Dirty) AND (&#42;Bathroom)*\n\nQuery names being nested cannot contain spaces. Only the AND and OR operators function with nested queries.\n\n## Case Sensitivity\nBy default, query terms are handled in a case-insensitive manner. Case-sensitivity on a query term can be enforced using the ~ operator. *~Google NEAR/10 Microsoft* will hit for the phrase \"Both tech giants Microsoft and Google are investing heavily in mobile technologies\" as well as the phrase \"who wins in search, microsoft, bing or google?\"\n\n## Stemming\nBy default, query terms are stemmed. For phrase searches, only the right-most word is stemmed. The query process will not stem all words within the multi-word phrase. (e.g. \"driving on faster roads\" will match *\"driving on faster road&#42;\"* but will not match *\"driving on fast&#42; roads\"*). 
Special characters may be used within query phrases if they are in quotations.\n\n**Correct Query:** \n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"Gepp OR Gunther OR Hasso OR \\\"Hayden-Smith\\\" OR Hirakubo OR Kanai OR Mathis OR Moeller OR \\\"Nijssen_Smith\\\" OR Sherman OR Shimizu OR \\\"U'Ren\\\" OR Daiji\",\n      \"language\": \"text\",\n      \"name\": \"Correct Query: \"\n    }\n  ]\n}\n[/block]\n**Wrong Query:** \n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"Gepp OR Gunther OR Hasso OR Hayden-Smith OR Hirakubo OR Kanai OR Mathis OR Moeller OR Nijssen_Smith OR Sherman OR Shimizu OR U'Ren OR Daiji\",\n      \"language\": \"text\",\n      \"name\": \"Wrong Query: \"\n    }\n  ]\n}\n[/block]\n## Accents\nIf a query term is written without accents, the term will match text that has accents. For instance, if your query term is gate, you will also match the text gâté.\n\nIf you have accents in your query terms, then only the exact form will be match. For instance, if your query term is gâté, you will not match gate.\n\n## Scores\nQuery results will be accompanied with two scores, Query Relevancy and Query Sentiment.\n\n## Query Relevancy\nQuery Relevancy is a count of the query terms found within a document. It can be particularly effective in determining the effectiveness of your queries based on your text. \n\n**Consider the following text:** \n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"I have one cat and I used to have a dog too.\",\n      \"language\": \"text\"\n    }\n  ]\n}\n[/block]\nThe query relevancy score for the query* cat OR dog OR bird *will be 2 because the query detects two of the query terms.\n\n## Query Sentiment\nQuery Sentiment is the sentiment for each query It is calculated by finding the query hits, finding sentiment terms near the hits, and averaging the score for all found terms.\n\n## Examples\nThe most important thing to keep in mind when creating queries is to keep them simple and organized. Here are some examples of queries that vary in complexity:\n\n**Germ**\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"anti* OR bact* OR germ* OR \\\"anti-bacterial\\\"\",\n      \"language\": \"text\"\n    }\n  ]\n}\n[/block]\nThis uses simple \"OR\" logic whiile incorporating the wildcard (*) to account for plural versions and typos/misspellings.\n\n**Internet Banking – Mobile Access**\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"((internet OR online OR paperless) AND (bank*)) AND (mobile OR cell* OR phone* OR access*)\",\n      \"language\": \"text\"\n    }\n  ]\n}\n[/block]\nThis is similar \"OR\" logic and wildcard usage like the last example. The AND operator requires the use of parentheses to keep the desired logic.\n\n**Price (Negative)**\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"(pric* OR cost* OR fee* OR item*) AND (high OR expensive OR premium OR \\\"so much\\\" OR disappoint* OR spendy OR (\\\"too\\\" AND (high OR \\\"much\\\" OR expensive)) OR (\\\"not\\\" AND (good OR competitive* OR worth OR fair))) OR (\\\"too expensive\\\" OR \\\"a little expensive\\\")\",\n      \"language\": \"text\"\n    }\n  ]\n}\n[/block]\nSometimes, customers have used two separate queries for a single term (i.e. instead of one query for price, there is one for Price (Positive) and one for Price (Negative)). A downside of this system is false positives/negatives can occur. 
For example, the comment \"it has high quality and reasonable prices\" would attach to Price (Positive) query and the Price (Negative) query, when it belongs with only the Price (Positive) query.\n\n**Price (Negative)**\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"(pric* OR cost* OR fee* OR item*) AND (expensive OR premium OR \\\"so much\\\" OR disappoint* OR (\\\"too\\\" AND (\\\"much\\\" OR expensive)) OR (\\\"not\\\" AND (good OR competitive* OR worth OR fair))) OR ((\\\"too expensive\\\" OR \\\"a little expensive\\\") AND (price* OR cost* OR fee* OR item*)) NEAR/8 (high OR courses)\",\n      \"language\": \"text\"\n    }\n  ]\n}\n[/block]\nTo fix the problem above, we added an operator at the end of the query, removed \"high\", and added parentheses at the beginning and end of the original query. The \"AND\" and \"NEAR/8\" operators act to nullify the false negative by adding the qualification that high needs to be equal to or less than 8 characters from \"price, cost, fee, or item\".)\n\n## Stopwords\nStopwords remove small and common words which have little effect on the content, like prepositions and conjunctions. In a query, all stopwords must be encapsulated in quotes.\n\n[Download list of stopwords](https://semantria.com/files/stopwords.txt)\n\n## Troubleshooting\nUse the following checklist to validate your queries and avoid errors:\n  * Default max length: 1,500 characters (but may be changed- [contact us](https://semantria.com/contact))\n  * Query cannot be empty\n  * Operators must be CAPITALIZED\n  * *NEAR* accepts values from 1 to 99 (e.g. *NEAR/3*)\n  * Operators are always surrounded by words (e.g. *coffee AND tea AND decaf*)\n  * Double-check opening and closing quotes and parentheses\n  * Query terms cannot contain special characters\n  * Special characters are: `! @ # $ % ^ ( ) _ - = ~ + [ ] { } ( ) | \" ' : ; . , < > ? /\n  * Spaces are special characters\n  * Terms containing special characters or phrases containing more than two words should be enclosed (escaped) in quotes (e.g. \"*#beautifulflowers\", \"customer service\", \"123 Ave Rosemont\", \"rendez-vous\"*)\n  * Queries can contain stopwords, but they must be enclosed in quotes","excerpt":"Queries are useful when looking for specific terms within a document. Using boolean logic, you can search for any type of speech pattern and extract exact phrases while ignoring everything else. Start with short queries at first and add on as you get a feel for how they work.","slug":"queries-query-topics","type":"basic","title":"Queries (Query topics)","__v":0,"childrenPages":[]}

Queries (Query topics)

Queries are useful when looking for specific terms within a document. Using boolean logic, you can search for any type of speech pattern and extract exact phrases while ignoring everything else. Start with short queries at first and add on as you get a feel for how they work.

## General Guidelines

* Default max length: 1,500 characters (but may be changed - contact us)
* Query cannot be empty
* Operators must be CAPITALIZED
* NEAR accepts values from 1 to 99 (e.g. NEAR/3)
* Operators are always surrounded by words (e.g. coffee AND tea AND decaf)
* Double-check opening and closing quotes and parentheses
* Query terms cannot contain special characters
* Special characters are: `! @ # $ % ^ ( ) _ - = ~ + [ ] { } ( ) | " ' : ; . , < > ? / 1 2 3 4 5 6 7 8 9 0`
* Spaces are special characters
* Terms containing special characters or phrases containing more than two words should be enclosed (escaped) in quotes (e.g. "#beautifulflowers", "customer service", "123 Ave Rosemont", "rendez-vous")
* Queries can contain stopwords, but they must be enclosed in quotes

## Operators
Note that operators **must** be capitalized; otherwise they will be treated as query terms. Query operators must also be preceded and followed by query terms or query phrases.

## OR operator
Inside a query, the OR operator may be used to retrieve documents containing either of two terms.

**Example:**
*onions OR cheese* will detect "Onions make my eyes water", "My favorite cheese is cheddar", and "I want cheese and onions on my pizza".

## AND operator
Inside a query, the AND operator may be used to retrieve documents containing both specified terms.

**Example:**
*onions AND cheese* will detect "I want cheese and onions on my pizza" or "I like cheese on my onion rings", but not "Onions make my eyes water" or "My favorite cheese is cheddar."

## NEAR operator
A NEAR operator is effectively an AND operator where you can control the distance between the words. *onions NEAR cheese* means that the term cheese must exist within 10 words of onions. The default distance is 10 words, but you can vary the distance the NEAR operation uses by adding a number suffix such as *onions NEAR/50 cheese*, which means *onions* must exist within 50 words of *cheese*. This window can be between 1 and 99.

**Other examples include:**
*(onions OR bananas) NEAR/5 (cheese OR dinner)* would tag "The banana split was included with dinner" and "The steak dinner with onions was my favorite." This query will **not** detect sentences like "The cheese platter on the dinner menu was superb" or "Bananas, strawberries, and ice cream are not a balanced dinner."

*(onions NEAR/5 cheese)* would tag a comment like "Do you want onions on top of your cheese?" but not "Their cheese is my favorite but only on the dish with caramelized onions."

> **Do not use the NEAR operator in the following fashions:**
> *"onions NEAR/10 cheese"* – this does nothing
> *onions "NEAR/10" cheese* – this does nothing

## NOTNEAR operator
A NOTNEAR operator is effectively a NOT operator where you can control the distance between the words. *onions NOTNEAR cheese* means that the term cheese cannot exist within 10 words of onions. The default distance is 10 words, but you can vary the distance the NOTNEAR operation uses by adding a number suffix such as *onions NOTNEAR/50 cheese*, which means *onions* cannot exist within 50 words of *cheese*. This window can be between 1 and 99.
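To illustrate the window semantics of NEAR/k and NOTNEAR/k, here is a toy Python check over whitespace tokens (an illustrative sketch, not the engine's matcher):

```python
# Toy sketch of the NEAR/k and NOTNEAR/k window checks over whitespace tokens
# (illustrative only; the engine's tokenization and matching are richer).
def near(tokens, a, b, k=10):
    pos_a = [i for i, t in enumerate(tokens) if t == a]
    pos_b = [i for i, t in enumerate(tokens) if t == b]
    return any(abs(i - j) <= k for i in pos_a for j in pos_b)

text = "do you want onions on top of your cheese".split()
print(near(text, "onions", "cheese", k=5))      # True: 5 words apart
print(not near(text, "onions", "cheese", k=3))  # True: NOTNEAR/3 holds
```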
*"onions WITH cheese"* means that the term cheese must exist within the same sentence as onions.​ ## NOTWITH operator A NOTWITH operator requires that the two terms cannot occur within the same sentence. As such, it is the same as a NOTNEAR operator, with the exception that the match window between the two terms is not specified. *"onions NOTWITH cheese"* means that the term cheese cannot exist within the same sentence as onions.​ ## NOT operator The NOT operator excludes any documents containing the term which follows it. *onions NOT celery* will return all uses of onion, excluding those that contain "celery." A query must contain at least one non-excluded term when using the NOT operator. **Example** *onions NOT celery* will detect "I like onions very much" but not "I like onions on my sandwich and celery on the side." ## EXCLUDE operator Two query terms of any type may be joined by an EXCLUDE operator, e.g. *York EXCLUDE "New York"*. The effect is different than that of the NOT operator. The query will return documents with the word "York", excluding those that only contain occurrences of "New York". **Consider the following sample text:** [block:code] { "codes": [ { "code": "I spent the day in York, visiting the magnificent cathedral. Then it was time to head back to London for my flight home to New York.", "language": "text" } ] } [/block] **This text would generate the following results for the provided queries:** *York NOT "New York"*: FALSE *York EXCLUDE "New York"*: TRUE ## Parentheses Queries can use parentheses to control the logic of the query and they may appear in any combination. **Two examples of queries with smart uses of parentheses are:** *((onions OR cheese) AND celery) NOT horrible (onions OR cheese) NEAR (horrible OR disgusting)* Every left parenthesis must have a corresponding right parenthesis. Queries can have nested parentheses up to 10 levels deep. ## Queries ##Terms and Phrases Single query terms are the simplest query element, consisting of a single word. [block:callout] { "type": "warning", "body": "A query term can be an operator or a word that appears in a stopword list **only** if it is in quotations.", "title": "Query term from stopword list" } [/block] A query term cannot contain punctuation or other special characters like `! @ # $ % ^ ( ) _ = ~ + [ ] { } ( ) | " ' : ; . , < > ? / - Phrases must be enclosed in double quotes. When a single word is enclosed in quotes, it is not treated as a phrase search: it is treated like a single word. ## Wildcards A wildcard character (&#42;) may be used at the end of a single word query term or within a phrase. It allows the system to tag all spellings of the word starting with the letters before the wildcard (&#42;). Wildcards will only work in phrases if they are attached to the last term in the phrase. **For example:** *excit&#42;* would match excite, exciting, excitement, etc. "*running fast&#42;*" would match "running fast" and "running faster". [block:callout] { "type": "warning", "body": "There must be at least a three-letter prefix to a wildcard query. d&#42;, do&#42;, and dog&#42;M are all invalid. Queries like *\"&#42;\"* and *Commonwealth AND *\"&#42;\"* are invalid and achieve nothing." } [/block] ## Nested Queries Referencing a query is done by placing an asterisk (&#42;) that the beginning of a query name and wrapping the asterisk and the query name in parentheses "( )". It signals to the system to look for a query and use it in another query. 
## Nested Queries
Referencing a query is done by placing an asterisk (&#42;) at the beginning of a query name and wrapping the asterisk and the query name in parentheses "( )". This signals the system to look for a query and use it in another query. For example, consider the following queries:

**Dirty** *dirty OR filth&#42; OR disgust&#42; OR nasty*
**Bathroom** *bathroom&#42; OR toilet&#42; OR restroom&#42; OR lavatory&#42;*
**Restaurant_Interior** *restaurant OR table&#42; OR chair&#42; OR carpet&#42; OR furniture&#42; OR plate&#42; OR cup&#42;*

Two queries can be combined to create a nested query.

**For example:**
**Dirty Bathroom** *(&#42;Dirty) AND (&#42;Bathroom)*

Query names being nested cannot contain spaces. Only the AND and OR operators function with nested queries.

## Case Sensitivity
By default, query terms are handled in a case-insensitive manner. Case-sensitivity on a query term can be enforced using the ~ operator. *~Google NEAR/10 Microsoft* will hit for the phrase "Both tech giants Microsoft and Google are investing heavily in mobile technologies" as well as the phrase "who wins in search, microsoft, bing or google?"

## Stemming
By default, query terms are stemmed. For phrase searches, only the right-most word is stemmed. The query process will not stem all words within a multi-word phrase (e.g. "driving on faster roads" will match *"driving on faster road&#42;"* but will not match *"driving on fast&#42; roads"*). Special characters may be used within query phrases if they are in quotations.

**Correct Query:**

```text
Gepp OR Gunther OR Hasso OR "Hayden-Smith" OR Hirakubo OR Kanai OR Mathis OR Moeller OR "Nijssen_Smith" OR Sherman OR Shimizu OR "U'Ren" OR Daiji
```

**Wrong Query:**

```text
Gepp OR Gunther OR Hasso OR Hayden-Smith OR Hirakubo OR Kanai OR Mathis OR Moeller OR Nijssen_Smith OR Sherman OR Shimizu OR U'Ren OR Daiji
```

## Accents
If a query term is written without accents, the term will match text that has accents. For instance, if your query term is gate, you will also match the text gâté.

If you have accents in your query terms, then only the exact form will be matched. For instance, if your query term is gâté, you will not match gate.

## Scores
Query results are accompanied by two scores, Query Relevancy and Query Sentiment.

## Query Relevancy
Query Relevancy is a count of the query terms found within a document. It can be particularly effective in determining the effectiveness of your queries on your text.

**Consider the following text:**

```text
I have one cat and I used to have a dog too.
```

The query relevancy score for the query *cat OR dog OR bird* will be 2 because the query detects two of the query terms.

## Query Sentiment
Query Sentiment is the sentiment for each query. It is calculated by finding the query hits, finding sentiment terms near the hits, and averaging the score for all found terms.
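As a concrete reading of the relevancy count for a flat OR query, here is a toy Python version that reproduces the example above:

```python
# Relevancy as the number of query terms that hit, matching the
# cat/dog/bird example above (toy code for a flat OR query).
import re

def relevancy(text, terms):
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sum(1 for t in terms if t in words)

print(relevancy("I have one cat and I used to have a dog too.",
                ["cat", "dog", "bird"]))  # 2
```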
## Examples
The most important thing to keep in mind when creating queries is to keep them simple and organized. Here are some examples of queries that vary in complexity:

**Germ**

```text
anti* OR bact* OR germ* OR "anti-bacterial"
```

This uses simple "OR" logic while incorporating the wildcard (&#42;) to account for plural forms and typos/misspellings.

**Internet Banking – Mobile Access**

```text
((internet OR online OR paperless) AND (bank*)) AND (mobile OR cell* OR phone* OR access*)
```

This uses similar "OR" logic and wildcards to the last example. The AND operator requires the use of parentheses to keep the desired logic.

**Price (Negative)**

```text
(pric* OR cost* OR fee* OR item*) AND (high OR expensive OR premium OR "so much" OR disappoint* OR spendy OR ("too" AND (high OR "much" OR expensive)) OR ("not" AND (good OR competitive* OR worth OR fair))) OR ("too expensive" OR "a little expensive")
```

Sometimes customers use two separate queries for a single topic (e.g. instead of one query for price, there is one for Price (Positive) and one for Price (Negative)). A downside of this approach is that false positives/negatives can occur. For example, the comment "it has high quality and reasonable prices" would match both the Price (Positive) query and the Price (Negative) query, when it belongs with only the Price (Positive) query.

**Price (Negative)**

```text
(pric* OR cost* OR fee* OR item*) AND (expensive OR premium OR "so much" OR disappoint* OR ("too" AND ("much" OR expensive)) OR ("not" AND (good OR competitive* OR worth OR fair))) OR (("too expensive" OR "a little expensive") AND (price* OR cost* OR fee* OR item*)) NEAR/8 (high OR courses)
```

To fix the problem above, we added a NEAR/8 operator at the end of the query, removed "high" from the original list, and added parentheses around the beginning and end of the original query. The "AND" and "NEAR/8" operators nullify the false positive by adding the qualification that "high" must occur within 8 words of "price", "cost", "fee", or "item".

## Stopwords
Stopwords remove small and common words which have little effect on the content, like prepositions and conjunctions. In a query, all stopwords must be enclosed in quotes.

[Download list of stopwords](https://semantria.com/files/stopwords.txt)

## Troubleshooting
Use the following checklist to validate your queries and avoid errors:

* Default max length: 1,500 characters (but may be changed - [contact us](https://semantria.com/contact))
* Query cannot be empty
* Operators must be CAPITALIZED
* *NEAR* accepts values from 1 to 99 (e.g. *NEAR/3*)
* Operators are always surrounded by words (e.g. *coffee AND tea AND decaf*)
* Double-check opening and closing quotes and parentheses
* Query terms cannot contain special characters
* Special characters are: `! @ # $ % ^ ( ) _ - = ~ + [ ] { } ( ) | " ' : ; . , < > ? /`
* Spaces are special characters
* Terms containing special characters or phrases containing more than two words should be enclosed (escaped) in quotes (e.g. *"#beautifulflowers", "customer service", "123 Ave Rosemont", "rendez-vous"*)
* Queries can contain stopwords, but they must be enclosed in quotes
{"category":"577e4bf24159cd1900d5d2bc","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2ce","createdAt":"2016-06-10T13:51:30.203Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"name":"","code":"{}","language":"json","status":200},{"name":"","code":"{}","language":"json","status":400}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":96,"body":"Semantria uses a machine learning model to extract entities from the text. These are returned as entities of type \"named.\" This base entity extraction model cannot be tuned by the user but you can add new entities you define, which are returned as type \"user.\"\n\nWhen telling the system what to look for, you can use either exact phrases (Microsoft) or the Boolean query syntax (Microsoft OR MSFT). Custom entities are often used for things like product names or store locations or are used to normalize variations in names to a single one, such as normalizing MSFT and Microsoft to the same name.\n\nAdding an entity consists of configuring:\n[block:parameters]\n{\n  \"data\": {\n    \"h-0\": \"Field Name\",\n    \"h-1\": \"Purpose\",\n    \"0-0\": \"Entity\",\n    \"0-1\": \"text to look for\",\n    \"1-0\": \"Label\",\n    \"1-1\": \"A label field returned in the API output to give more information about the entity, such as a link to a Wikipedia page\",\n    \"2-0\": \"Type\",\n    \"2-1\": \"Usually used to differentiate types of entities from each other, such as Company, Beverage, or Competitor.\",\n    \"3-0\": \"Normalized\",\n    \"3-1\": \"You can use this to normalize different forms of the entity to the same value. For instance, one entity might be Coke, and another Coca-Cola. If you enter the same normalized value for each entity, they will appear in the output as the same thing.\"\n  },\n  \"cols\": 2,\n  \"rows\": 4\n}\n[/block]\nIf you use a query in the Entity field, you must also enter a value for the Normalized field.","excerpt":"","slug":"entity-configuration","type":"basic","title":"Entity Configuration","__v":0,"childrenPages":[]}

Entity Configuration


Semantria uses a machine learning model to extract entities from the text. These are returned as entities of type "named." This base entity extraction model cannot be tuned by the user, but you can add new entities you define, which are returned as type "user."

When telling the system what to look for, you can use either exact phrases (Microsoft) or the Boolean query syntax (Microsoft OR MSFT). Custom entities are often used for things like product names or store locations, or are used to normalize variations in names to a single one, such as normalizing MSFT and Microsoft to the same name.

Adding an entity consists of configuring:

| Field Name | Purpose |
|---|---|
| Entity | Text to look for |
| Label | A label field returned in the API output to give more information about the entity, such as a link to a Wikipedia page |
| Type | Usually used to differentiate types of entities from each other, such as Company, Beverage, or Competitor. |
| Normalized | You can use this to normalize different forms of the entity to the same value. For instance, one entity might be Coke, and another Coca-Cola. If you enter the same normalized value for each entity, they will appear in the output as the same thing. |

If you use a query in the Entity field, you must also enter a value for the Normalized field.
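As an illustration, a user-entity definition combining these fields might look like the following sketch. The key names mirror the table and are hypothetical, not the exact wire format:

```python
# Hypothetical user-entity definition mirroring the four fields above.
# Key names are illustrative; consult the API reference for the exact format.
user_entity = {
    "name": "Microsoft OR MSFT",  # Entity: an exact phrase or a Boolean query
    "label": "https://en.wikipedia.org/wiki/Microsoft",  # extra info in output
    "type": "Company",            # differentiates entity types
    "normalized": "Microsoft",    # required here because the entity is a query
}
```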
{"category":"577e4bf24159cd1900d5d2bc","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2cf","createdAt":"2015-07-07T21:44:07.100Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":97,"body":"Categories are best for catching broad concepts to help sort your content. They use our Concept Matrix for categorization and to ease the creation of categories. Semantria categorizes out of the box with over 400 auto-categories pre-loaded. Users can also create custom categories to suit their particular data needs.\nSemantria's Categories are based on the Concept Matrix, which was created by processing the entire Wikipedia taxonomy. Instead of searching for specific terms like Queries, Categories support conceptual matches. Categories will return documents related to your query, even if different terms are used. For example, a query for 'food' in a Boolean search engine would return only articles using the specific word 'food.' With a category you'd also get results that discuss pizza, sandwiches, and other food-related words.\n[Categorization tips and tricks >>](#tip-and-tricks)\n##Auto-categories\nSemantria offers an intelligent categorizer that maps users' content against all of Wikipedia's taxonomies. There are approximately 400 first-level auto-categories and more than 4000 second-level auto-categories. Semantria API returns a \"relevancy score,\" which is a score between 0 and 1 that represents how confident Semantria is about whether the document falls into that category.\n[Download full auto-category taxonomy](https://semantria.com/files/taxonomy.txt)\n##User categories\nUsers can create their own categories to extract the exact information they're looking for. Semantria ships with a default list of 40 categories, but users may create as many new categories as they would like. To create a new category, give Semantria the name of your category as well as 4-6 very obvious examples of your category. For example, if you wanted to create a \"Vegan\" category, some good sample words might be \"vegan, diet, vegetarian, animal, tofu, veganism.\" Semantria will use Wikipedia's ontology to categorize sentences and comments into different categories. Read more about this on our technology page.\nTwo examples of pre-generated User Categories include Agriculture (farming, agriculture, farmer) and Food (food, meals, vegetables, meat, fruit). Here is how the following categories match these example sentences:\n[block:parameters]\n{\n  \"data\": {\n    \"h-0\": \"Sentence for Analysis\",\n    \"h-1\": \"Food\",\n    \"h-2\": \"Agriculture\",\n    \"0-0\": \"I like chicken.\",\n    \"0-1\": \"0.58\",\n    \"0-2\": \"No match\",\n    \"1-0\": \"I like chickens.\",\n    \"1-1\": \"No match\",\n    \"1-2\": \"0.71\",\n    \"2-0\": \"I like to eat chicken.\",\n    \"2-1\": \"0.59\",\n    \"2-2\": \"0.51\"\n  },\n  \"cols\": 3,\n  \"rows\": 3\n}\n[/block]\nNotice the different scores based on the different categories. 
Download the list of [default user categories ](https://www.lexalytics.com/files/usercategories.txt)as a template for your own user categories.\n[block:callout]\n{\n  \"type\": \"warning\",\n  \"body\": \"The default user categories provided are not comprehensive. They should serve as examples for your own user categories.\"\n}\n[/block]\n##Relevancy Score\nSemantria retuns a relevancy score with every entry and an associated category in an analysis. This score from 0 to 1 represents how confident Semantria is that the entry fits in the listed category. A low score implies low certainty and a high score implies confidence.\n\n##Weights\nThe weight affects the category relevancy score of a given category and has no bearing on the categorization algorithm. The relevancy score is calculated based on its own metrics and determines the engine’s confidence of a given category in the text. By default, Semantria has a concept threshold that leads the engine to drop relevancy scores that are lower than the threshold limit because it is not confident in the category. Out-of-the-box, the threshold limit is 0.45. Any categories with a relevancy score lower than 0.45 will not be reported in the output. The weight allows the user to adjust the relevancy score so certain categories will clear the threshold and appear in output. If a category is not returned in the output, it means its relevancy score did not reach the minimum threshold.\nUsing the default threshold of 0.45. Assume that the category “Stock Exchange” has a relevancy score of 0.3. With a score of 0.3, the category will not be shown in the output. However, it is important for the user to see the “Stock Exchange” category output. By adjusting the weight of the category and defining it as 1.7, the engine will multiply it by the weight with the strength score:\n\n0.3*1.7 = 0.51\n\nNow with a relevancy score of 0.51, the “Stock Exchange” category will be seen in the output.\nIn another example, for the category “Pharmaceuticals”, the weight is 0.95 and the strength score is 0.35.\n\n0.35*0.95 = 0.33\n\nThe “Pharmaceuticals” category will not appear in the output. But by changing the weight to 1.5:\n\n0.35*1.5 = 0.53\n\nNow the “Pharmaceuticals” category will appear in the output.\nIf the weight of a category is not defined, Semantria will report the category as if its relevancy score is higher than 0.45. Alternatively, if the category weight is set to 0, the relevancy score will also be 0 and there will be no output for the specific category.\nThere are no boundaries for the weight, so it depends on the specifics of the document and categories. However, best practices recommends weights between 0.85 and 2.\n\n##Category definition syntax\n**Text**\n\nThe basic category definition syntax is simply words and phrases expressing the idea you'd like to match. You can use commas to break up separate ideas. 'oil paint' is looking for articles about the artistic medium. While 'oil, paint' is looking for articles about petroleum and/or any type of paint.\nThe list can be as long as you'd like, although many short queries usually outperform very long, detailed ones. A perfect match would be related to all the terms given, but partial matches can occur where the article is only related to a subset of the terms.\n\n**Underscore**\nWhen the concept matrix is given a phrase, it matches both the phrase form as well as the individual words. 
Thus 'power plant', while matching stories about electric generation most strongly, may also pull in articles about plant life. In most queries the individual words in a phrase are related and contribute positively. But in cases where the individual words mean something different on their own, underscore instructs the engine to only use the phrase form. Thus 'power_plant' will not match articles about flowers at all.\n\n**NOT operator**\nNOT excludes certain ideas from consideration. This operator is primarily intended for narrowing down the meanings of words and phrases, or otherwise limiting the scope implied by a word or phrase. When using bank as a sample for a ‘Financial’ category may match articles related to mortgages, finances, but also to riverbanks. To avoid this scenario, using the NOT operator instructs the engine to ignore any mention of word after NOT. Therefore, ‘bank NOT river’ would match all articles related to finance but exclude any articles related to a riverbank.\n\n**CONTEXT operator**\nThe CONTEXT operator is the opposite of the NOT operator. While NOT excludes certain ideas implied by a definition, CONTEXT highlights certain ideas.\nAssume you are interested in automobile manufacturing. The definition 'automobile, manufacturing' is likely to get relevant results, but may also pull in articles about manufacturing in general. The query 'automobile_manufacturing' is highly specific, but possibly overly so.\nBy using the CONTEXT operator, the text to the left of CONTEXT supplies the general idea being searched for, and the text to the right supplies the ideas you want that topic to be discussed with. Therefore, ‘automobile CONTEXT manufacturing’ will result in a search for automotive in general, with a focus on manufacturing. It will not return results just about manufacturing.\n\n**Boolean Filter Queries**\nIn addition to a list of terms to aid in defining the concept you are looking to match, boolean logic can be included in a category definition to provide a level of specific filtering for concept matches. Boolean queries are enclosed within [ ] brackets in the category definition. For example, [(pizza AND seafood) NOT (shrimp OR “king crab”)] will match and categorize content that discusses pizza with seafood, but doesn’t contain shrimp or king crab.\n\n##When to use\nThe following example from a Tripadvisor review is a good illustration of the differences and advantages of categories versus query-based categorizers when identifying broad concepts.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"We were on the ship by 11:45. About 10 minutes after my VIP parents. Went to Lido deck for lunch. It was hard to find a table for ten or two table near each other so we went to the back of the ship near the pizza. The ship was full and you could tell. Very crowded. We got to our room at 1:30 met Edouardo our room steward. Loved him got everyone's name but had a hard time with mine and called me misses all week. The room was the same as on the Splendor which we did two years ago. 4 of us had plenty of room and loved the balcony. We had early seating in Washington Dining room 3rd floor in the middle so no view for us. I wanted to do anytime dining but with ten it wouldn't of worked. The boys and us went to club sign up. My 14 year old was going to be 15 in September and I wanted him moved to club O2 they said to write it on the form. He was able to switch no problem. Club started at 9:00pm. 
They had a great time all week and came home 12:30-1:00 every night.\",\n      \"language\": \"text\"\n    }\n  ]\n}\n[/block]\nWith Categories, the user could have searched for \"travel\" or \"tourism\" categories and found this passage in about 10 seconds. A Query search would have taken much longer to find this passage and would have been less successful. A search could have been \"motel OR hotel OR show OR resort OR pool OR travel OR vacation;\" It would have taken about 5 minutes to define and would not have detected that this passage was about travel.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Tip and tricks\"\n}\n[/block]\n  * Use categories only for broad and generic categorization. Search for concepts or groups of things, like car, airplanes, politics, or arts.\n  * A rule of thumb is if there is no category in Wikipedia, don't build such a category. Instead, use Queries.\n  * Provide 4-6 very obvious sample words when creating a category.\n  * If a category is not showing up, increase the category weight.","excerpt":"","slug":"user-categories-concept-topics","type":"basic","title":"User categories (Concept topics)","__v":0,"childrenPages":[]}

User categories (Concept topics)


Categories are best for catching broad concepts to help sort your content. They use our Concept Matrix both for categorization and to ease the creation of new categories. Semantria categorizes out of the box with over 400 auto-categories pre-loaded. Users can also create custom categories to suit their particular data needs.

Semantria's Categories are based on the Concept Matrix, which was created by processing the entire Wikipedia taxonomy. Instead of searching for specific terms like Queries do, Categories support conceptual matches. Categories will return documents related to your query even if different terms are used. For example, a query for 'food' in a Boolean search engine would return only articles using the specific word 'food.' With a category you'd also get results that discuss pizza, sandwiches, and other food-related words.

[Categorization tips and tricks >>](#tips-and-tricks)

##Auto-categories
Semantria offers an intelligent categorizer that maps users' content against all of Wikipedia's taxonomies. There are approximately 400 first-level auto-categories and more than 4000 second-level auto-categories. The Semantria API returns a "relevancy score," a score between 0 and 1 that represents how confident Semantria is that the document falls into that category.

[Download full auto-category taxonomy](https://semantria.com/files/taxonomy.txt)

##User categories
Users can create their own categories to extract the exact information they're looking for. Semantria ships with a default list of 40 categories, but users may create as many new categories as they would like. To create a new category, give Semantria the name of your category as well as 4-6 very obvious examples of it. For example, if you wanted to create a "Vegan" category, some good sample words might be "vegan, diet, vegetarian, animal, tofu, veganism." Semantria will use Wikipedia's ontology to categorize sentences and comments into the different categories. Read more about this on our technology page.

Two examples of pre-generated User Categories are Agriculture (farming, agriculture, farmer) and Food (food, meals, vegetables, meat, fruit). Here is how these categories match the following example sentences:

[block:parameters]
{
  "data": {
    "h-0": "Sentence for Analysis",
    "h-1": "Food",
    "h-2": "Agriculture",
    "0-0": "I like chicken.",
    "0-1": "0.58",
    "0-2": "No match",
    "1-0": "I like chickens.",
    "1-1": "No match",
    "1-2": "0.71",
    "2-0": "I like to eat chicken.",
    "2-1": "0.59",
    "2-2": "0.51"
  },
  "cols": 3,
  "rows": 3
}
[/block]

Notice the different scores based on the different categories.

Download the list of [default user categories](https://www.lexalytics.com/files/usercategories.txt) as a template for your own user categories.

[block:callout]
{
  "type": "warning",
  "body": "The default user categories provided are not comprehensive. They should serve as examples for your own user categories."
}
[/block]

##Relevancy Score
Semantria returns a relevancy score with every entry and associated category in an analysis. This score from 0 to 1 represents how confident Semantria is that the entry fits in the listed category. A low score implies low certainty and a high score implies confidence.

##Weights
The weight adjusts the relevancy score of a given category and has no bearing on the categorization algorithm itself. The relevancy score is calculated from its own metrics and reflects the engine's confidence that a given category applies to the text.
By default, Semantria has a concept threshold below which the engine drops relevancy scores because it is not confident in the category. Out of the box, the threshold is 0.45: any category with a relevancy score lower than 0.45 will not be reported in the output. The weight allows the user to adjust the relevancy score so certain categories will clear the threshold and appear in the output. If a category is not returned in the output, its relevancy score did not reach the minimum threshold.

Consider the default threshold of 0.45, and assume that the category "Stock Exchange" has a relevancy score of 0.3. With a score of 0.3, the category will not be shown in the output. Suppose, however, that it is important for the user to see the "Stock Exchange" category. By setting the weight of the category to 1.7, the engine multiplies the strength score by the weight:

0.3 * 1.7 = 0.51

Now, with a relevancy score of 0.51, the "Stock Exchange" category will appear in the output.

In another example, the category "Pharmaceuticals" has a weight of 0.95 and a strength score of 0.35:

0.35 * 0.95 = 0.33

The "Pharmaceuticals" category will not appear in the output. But by changing the weight to 1.5:

0.35 * 1.5 = 0.53

the "Pharmaceuticals" category will now appear in the output.

If the weight of a category is not defined, Semantria will report the category only if its relevancy score is higher than 0.45. Alternatively, if the category weight is set to 0, the adjusted relevancy score will also be 0 and the category will never appear in the output. There are no bounds on the weight, so the right value depends on the specifics of the documents and categories; however, best practice recommends weights between 0.85 and 2.
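The weight adjustment is a straightforward multiplication against the threshold. Here is a minimal sketch of the logic, assuming the default 0.45 threshold and treating an undefined weight as 1.0:

```python
# Minimal sketch of the weight/threshold logic described above.
# Assumes the default concept threshold of 0.45; an undefined
# weight is treated as 1.0 (the score is used as-is).
THRESHOLD = 0.45

def is_reported(strength_score, weight=1.0):
    """Return True if a category's adjusted score clears the threshold."""
    return strength_score * weight >= THRESHOLD

print(is_reported(0.3))           # False: 0.3 stays below 0.45
print(is_reported(0.3, 1.7))      # True:  0.3 * 1.7 = 0.51
print(is_reported(0.35, 0.95))    # False: 0.35 * 0.95 is about 0.33
print(is_reported(0.35, 1.5))     # True:  0.35 * 1.5 is about 0.53
print(is_reported(0.9, 0))        # False: weight 0 zeroes the score
```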
##Category definition syntax
**Text**

The basic category definition syntax is simply words and phrases expressing the idea you'd like to match. You can use commas to break up separate ideas: 'oil paint' looks for articles about the artistic medium, while 'oil, paint' looks for articles about petroleum and/or any type of paint. The list can be as long as you'd like, although many short queries usually outperform very long, detailed ones. A perfect match would be related to all of the terms given, but partial matches can occur where the article is related to only a subset of the terms.

**Underscore**

When the Concept Matrix is given a phrase, it matches both the phrase form and the individual words. Thus 'power plant', while matching stories about electric generation most strongly, may also pull in articles about plant life. In most queries the individual words in a phrase are related and contribute positively, but in cases where the individual words mean something different on their own, an underscore instructs the engine to use only the phrase form. Thus 'power_plant' will not match articles about flowers at all.

**NOT operator**

NOT excludes certain ideas from consideration. This operator is primarily intended for narrowing down the meanings of words and phrases, or otherwise limiting the scope implied by a word or phrase. Using 'bank' as a sample for a 'Financial' category may match articles related to mortgages and finances, but also to riverbanks. To avoid this, the NOT operator instructs the engine to ignore any mention of the word that follows NOT. Therefore, 'bank NOT river' would match articles related to finance but exclude articles related to riverbanks.

**CONTEXT operator**

The CONTEXT operator is the opposite of the NOT operator. While NOT excludes certain ideas implied by a definition, CONTEXT highlights certain ideas. Assume you are interested in automobile manufacturing. The definition 'automobile, manufacturing' is likely to get relevant results, but may also pull in articles about manufacturing in general. The query 'automobile_manufacturing' is highly specific, but possibly overly so. With the CONTEXT operator, the text to the left of CONTEXT supplies the general idea being searched for, and the text to the right supplies the ideas you want that topic to be discussed with. Therefore, 'automobile CONTEXT manufacturing' will search for automobiles in general, with a focus on manufacturing. It will not return results that are just about manufacturing.

**Boolean Filter Queries**

In addition to a list of terms to aid in defining the concept you are looking to match, Boolean logic can be included in a category definition to provide specific filtering for concept matches. Boolean queries are enclosed within [ ] brackets in the category definition. For example, [(pizza AND seafood) NOT (shrimp OR "king crab")] will match and categorize content that discusses pizza with seafood, but does not mention shrimp or king crab.

##When to use
The following example from a TripAdvisor review is a good illustration of the differences and advantages of categories versus query-based categorizers when identifying broad concepts.

[block:code]
{
  "codes": [
    {
      "code": "We were on the ship by 11:45. About 10 minutes after my VIP parents. Went to Lido deck for lunch. It was hard to find a table for ten or two table near each other so we went to the back of the ship near the pizza. The ship was full and you could tell. Very crowded. We got to our room at 1:30 met Edouardo our room steward. Loved him got everyone's name but had a hard time with mine and called me misses all week. The room was the same as on the Splendor which we did two years ago. 4 of us had plenty of room and loved the balcony. We had early seating in Washington Dining room 3rd floor in the middle so no view for us. I wanted to do anytime dining but with ten it wouldn't of worked. The boys and us went to club sign up. My 14 year old was going to be 15 in September and I wanted him moved to club O2 they said to write it on the form. He was able to switch no problem. Club started at 9:00pm. They had a great time all week and came home 12:30-1:00 every night.",
      "language": "text"
    }
  ]
}
[/block]

With Categories, the user could have searched for "travel" or "tourism" categories and found this passage in about 10 seconds. A Query search would have taken much longer to find this passage and would have been less successful. The query might have been "motel OR hotel OR show OR resort OR pool OR travel OR vacation"; it would have taken about 5 minutes to define and still would not have detected that this passage was about travel.

[block:api-header]
{
  "type": "basic",
  "title": "Tips and tricks"
}
[/block]

* Use categories only for broad and generic categorization. Search for concepts or groups of things, like cars, airplanes, politics, or arts.
* A rule of thumb: if there is no such category in Wikipedia, don't build the category. Use Queries instead.
* Provide 4-6 very obvious sample words when creating a category.
* If a category is not showing up, increase the category weight.
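Pulling the pieces above together, here is a hypothetical sketch of a user category definition: a name, a handful of obvious sample words, and a weight. The field names are illustrative only and may differ from the actual API payload.

```python
# Hypothetical user category definition; field names are illustrative
# and may not match the actual Semantria API payload exactly.
vegan_category = {
    "name": "Vegan",
    # 4-6 very obvious samples, per the guidance above
    "samples": ["vegan", "diet", "vegetarian", "animal", "tofu", "veganism"],
    # boost the relevancy score so borderline matches clear the 0.45 threshold
    "weight": 1.2,
}
```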
{"category":"577e4bf24159cd1900d5d2bc","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2d0","createdAt":"2015-09-04T18:03:09.855Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":98,"body":"The blacklist feature in Semantria allows you to suppress output that you find irrelevant for your application. For instance, in financial news, phrases like \"last quarter\" and \"first quarter\" will commonly be extracted as themes. If you are uninterested in seeing these themes, you can add them to the blacklist. \n\nThe blacklist words are matched against themes, facets, queries and entities. This means that if you put \"quarter\" into your blacklist, you will no longer see themes of \"last quarter\" or \"this quarter\" and you will also not see an entity of \"No Quarter.\"","excerpt":"How to suppress things","slug":"blacklist","type":"basic","title":"Blacklist","__v":0,"childrenPages":[]}

Blacklist

How to suppress things

The blacklist feature in Semantria allows you to suppress output that you find irrelevant for your application. For instance, in financial news, phrases like "last quarter" and "first quarter" will commonly be extracted as themes. If you are uninterested in seeing these themes, you can add them to the blacklist.

The blacklist words are matched against themes, facets, queries, and entities. This means that if you put "quarter" into your blacklist, you will no longer see themes of "last quarter" or "this quarter", and you will also not see an entity of "No Quarter."
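As a minimal sketch of this suppression behavior, assuming a simple case-insensitive word match (the engine's actual matching rules may be more sophisticated):

```python
# Minimal sketch of blacklist suppression, assuming case-insensitive
# word matching; the engine's real matching may be more sophisticated.
blacklist = {"quarter"}

def is_suppressed(phrase):
    """True if any word of the phrase appears in the blacklist."""
    return any(word.lower() in blacklist for word in phrase.split())

themes = ["last quarter", "this quarter", "revenue growth"]
entities = ["No Quarter", "Microsoft"]

print([t for t in themes if not is_suppressed(t)])    # ['revenue growth']
print([e for e in entities if not is_suppressed(e)])  # ['Microsoft']
```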
{"category":"577e4bf24159cd1900d5d2bc","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf24159cd1900d5d2d1","createdAt":"2015-10-21T16:31:07.900Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":99,"body":"[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"What is a taxonomy?\"\n}\n[/block]\nTaxonomies allow you to arrange your queries and categories into a hierarchical structure and this hierarchical structure is reflected in the output. You create nodes in your taxonomy, which function as folders. You can mix and match nodes, categories and queries within a node. By default, a taxonomy can be up to 5 levels deep and contain up to the number of queries and categories allowed in your configuration.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Why have a taxonomy?\"\n}\n[/block]\nTaxonomies allow you to build more complex categorization structures for your content, as well as mixing queries and categories. Categories are more useful for matching broad areas such as Sports, while queries are better at finding specific things, like \"Yankees.\"\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"How do I create a taxonomy?\"\n}\n[/block]\nThe easiest way to create a taxonomy is through SWEB, our online configuration management tool. However, we also have an endpoint, api.semantria.com/taxonomy, where you can create a new taxonomy via programmatic means. \n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"What output do I get from a taxonomy?\"\n}\n[/block]\nEvery query or category that matches a document will be provided in the output under a special “taxonomy” section of the output. In addition, the structure of the taxonomy nodes containing the matching queries and categories will also be printed. \n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"What Makes A Match?\"\n}\n[/block]\nMatches are calculated on the queries and categories in a node. If any of the queries or categories in a node match a document, the node will be returned in the output. For instance, if you have two queries, Cats and Dogs contained in a node called Pets, if either Cats or Dogs matches a document, the node Pets will be returned. If a node contains a child node, by default the child node only will be returned if the parent node is also returned. For example if you have a taxonomy like this:\n\nNode name=Pets\n   topic type=query name=Pets query=“pet OR domesticated”\n   Node name = Cats\n      topic type=query name=Cats query=“cat OR feline”\n   Node name= Dogs\n      topic type=query name=Dogs query=“dog OR canine”\n\nA document with the text “Cats are awesome” will not be a match for the node Cats, even though the query topic “Cats” is a hit for the document. The reason this is not a match is that the parent node, Pets, has a query topic associated with it, Pets, that was not matched. If the document read “I enjoy having pets and cats are awesome” then the node Cats will be matched and returned since its parent node, Pets, also matches. 
\n\nWhether parent matching is enforced is controllable at the node level.","excerpt":"","slug":"taxonomy","type":"basic","title":"Taxonomy","__v":0,"childrenPages":[]}

Taxonomy


[block:api-header]
{
  "type": "basic",
  "title": "What is a taxonomy?"
}
[/block]

Taxonomies allow you to arrange your queries and categories into a hierarchical structure, and this hierarchy is reflected in the output. You create nodes in your taxonomy, which function as folders, and you can mix nodes, categories, and queries within a node. By default, a taxonomy can be up to 5 levels deep and contain up to the number of queries and categories allowed in your configuration.

[block:api-header]
{
  "type": "basic",
  "title": "Why have a taxonomy?"
}
[/block]

Taxonomies allow you to build more complex categorization structures for your content, as well as to mix queries and categories. Categories are more useful for matching broad areas such as Sports, while queries are better at finding specific things, like "Yankees."

[block:api-header]
{
  "type": "basic",
  "title": "How do I create a taxonomy?"
}
[/block]

The easiest way to create a taxonomy is through SWEB, our online configuration management tool. However, we also have an endpoint, api.semantria.com/taxonomy, where you can create a new taxonomy programmatically.

[block:api-header]
{
  "type": "basic",
  "title": "What output do I get from a taxonomy?"
}
[/block]

Every query or category that matches a document will be provided under a special "taxonomy" section of the output. In addition, the structure of the taxonomy nodes containing the matching queries and categories will also be printed.

[block:api-header]
{
  "type": "basic",
  "title": "What Makes A Match?"
}
[/block]

Matches are calculated on the queries and categories in a node. If any of the queries or categories in a node match a document, the node will be returned in the output. For instance, if a node called Pets contains the two queries Cats and Dogs, the node Pets will be returned if either Cats or Dogs matches a document. If a node contains a child node, by default the child node will only be returned if the parent node is also returned. For example, if you have a taxonomy like this:

Node name=Pets
   topic type=query name=Pets query="pet OR domesticated"
   Node name=Cats
      topic type=query name=Cats query="cat OR feline"
   Node name=Dogs
      topic type=query name=Dogs query="dog OR canine"

A document with the text "Cats are awesome" will not be a match for the node Cats, even though the query topic "Cats" is a hit for the document. The reason is that the parent node, Pets, has a query topic associated with it (Pets) that was not matched. If the document read "I enjoy having pets and cats are awesome", then the node Cats would be matched and returned, since its parent node, Pets, also matches.

Whether parent matching is enforced is controllable at the node level.
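A minimal sketch of the default parent-matching rule, using a simple in-memory tree; this models the behavior described above under stated assumptions, not the engine's actual implementation:

```python
# Minimal sketch of default taxonomy parent matching: a node is
# returned only if one of its own topics matches and its parent
# node (if any) was also returned.
class Node:
    def __init__(self, name, topics, children=()):
        self.name = name        # node name, e.g. "Pets"
        self.topics = topics    # names of queries/categories in the node
        self.children = children

def matched_nodes(node, hits):
    """Yield node names returned in the output for a set of topic hits."""
    if any(topic in hits for topic in node.topics):
        yield node.name
        # children are only considered once the parent has matched
        for child in node.children:
            yield from matched_nodes(child, hits)

pets = Node("Pets", ["Pets"], [Node("Cats", ["Cats"]), Node("Dogs", ["Dogs"])])

print(list(matched_nodes(pets, {"Cats"})))           # []: parent Pets not matched
print(list(matched_nodes(pets, {"Pets", "Cats"})))   # ['Pets', 'Cats']
```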
{"category":"577e4bf24159cd1900d5d2bd","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d322","createdAt":"2015-07-07T21:30:40.378Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":100,"body":"Looking to integrate? Semantria's HTTP API wrappers are the most convenient way to access the Semantria API on your favorite framework. We support Java, .NET, PHP, Python, Ruby, and node.js. Our SDK libraries include Authentication, Session Manager, Serializer, Detailed analysis test app and Discovery analysis test app.\n\nIf you make an SDK for a language we don't have yet, send it to us! We will happily give you data-processing credit in return.\n\nSemantria SDKs are also on [GitHub](https://github.com/Semantria/semantria-sdk).\n[block:parameters]\n{\n  \"data\": {\n    \"0-0\": \"**Java**\",\n    \"0-1\": \"37 KB\",\n    \"0-2\": \"[Download](http://www.semantria.com/download/SDK/SemantriaJavaSDK.tar.gz)\",\n    \"1-0\": \"**.NET**\",\n    \"1-1\": \"93 KB\",\n    \"1-2\": \"[Download](http://www.semantria.com/download/SDK/SemantriaDotNetSDK.zip)\",\n    \"2-2\": \"[Download](http://www.semantria.com/download/SDK/SemantriaPHPSDK.tar.gz)\",\n    \"3-2\": \"[Download](http://www.semantria.com/download/SDK/SemantriaPythonSDK.tar.gz)\",\n    \"4-2\": \"[Download](http://www.semantria.com/download/SDK/SemantriaRubySDK.tar.gz)\",\n    \"5-2\": \"[Download](http://www.semantria.com/download/SDK/SemantriaJavaScriptSDK.tar.gz)\",\n    \"2-0\": \"**PHP**\",\n    \"2-1\": \"16 KB\",\n    \"3-0\": \"**Python**\",\n    \"3-1\": \"32 KB\",\n    \"4-0\": \"**Ruby**\",\n    \"4-1\": \"15 KB\",\n    \"5-0\": \"**JavaScript**\",\n    \"5-1\": \"18 KB\",\n    \"6-0\": \"**Node.js**\",\n    \"6-2\": \"[Download](http://www.semantria.com/download/SDK/SemantriaNodejsSDK.tar.gz)\",\n    \"6-1\": \"153 KB\"\n  },\n  \"cols\": 3,\n  \"rows\": 7\n}\n[/block]","excerpt":"","slug":"sdks","type":"basic","title":"SDKs","__v":0,"childrenPages":[]}

SDKs


Looking to integrate? Semantria's HTTP API wrappers are the most convenient way to access the Semantria API on your favorite framework. We support Java, .NET, PHP, Python, Ruby, JavaScript, and Node.js. Our SDK libraries include Authentication, Session Manager, Serializer, a Detailed analysis test app, and a Discovery analysis test app.

If you make an SDK for a language we don't have yet, send it to us! We will happily give you data-processing credit in return.

Semantria SDKs are also on [GitHub](https://github.com/Semantria/semantria-sdk).

[block:parameters]
{
  "data": {
    "0-0": "**Java**",
    "0-1": "37 KB",
    "0-2": "[Download](http://www.semantria.com/download/SDK/SemantriaJavaSDK.tar.gz)",
    "1-0": "**.NET**",
    "1-1": "93 KB",
    "1-2": "[Download](http://www.semantria.com/download/SDK/SemantriaDotNetSDK.zip)",
    "2-0": "**PHP**",
    "2-1": "16 KB",
    "2-2": "[Download](http://www.semantria.com/download/SDK/SemantriaPHPSDK.tar.gz)",
    "3-0": "**Python**",
    "3-1": "32 KB",
    "3-2": "[Download](http://www.semantria.com/download/SDK/SemantriaPythonSDK.tar.gz)",
    "4-0": "**Ruby**",
    "4-1": "15 KB",
    "4-2": "[Download](http://www.semantria.com/download/SDK/SemantriaRubySDK.tar.gz)",
    "5-0": "**JavaScript**",
    "5-1": "18 KB",
    "5-2": "[Download](http://www.semantria.com/download/SDK/SemantriaJavaScriptSDK.tar.gz)",
    "6-0": "**Node.js**",
    "6-1": "153 KB",
    "6-2": "[Download](http://www.semantria.com/download/SDK/SemantriaNodejsSDK.tar.gz)"
  },
  "cols": 3,
  "rows": 7
}
[/block]
{"category":"577e4bf24159cd1900d5d2bd","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d323","createdAt":"2015-07-22T23:00:09.870Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":101,"body":"[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Transactions\"\n}\n[/block]\nTransactions are the number of documents (tweets, articles, etc) you send to the API for analysis. Examples:\n\n  * 100 articles = 100 transactions\n  * 300 tweets = 300 transactions\nDifferent [accounts](https://semantria.com/prices) have different transaction limits. To view your number of available transactions, go to the [Dashboard](https://www.lexalytics.com/login).\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Limits\"\n}\n[/block]\nEach configuration has default limits on analysis settings that can be modified by request. To make changes to any limits, do not hesitate to email us at [support@lexalytics.com](mailto:support@lexalytics.com).\n[block:callout]\n{\n  \"type\": \"warning\",\n  \"title\": \"Character limits\",\n  \"body\": \"Any character limits listed below refer to single byte characters.\"\n}\n[/block]\n\n[block:parameters]\n{\n  \"data\": {\n    \"0-0\": \"Number of configurations per account\",\n    \"0-1\": \"varies\",\n    \"1-0\": \"Names of blacklists, configurations, categories, SBPs, entities\",\n    \"1-1\": \"50 characters\",\n    \"2-0\": \"# of blacklisted items\",\n    \"2-1\": \"100\",\n    \"3-0\": \"# of categories\",\n    \"3-1\": \"100\",\n    \"4-0\": \"# of samples per category\",\n    \"4-1\": \"20\",\n    \"5-0\": \"# of entities\",\n    \"5-1\": \"1000\",\n    \"6-0\": \"# of queries\",\n    \"6-1\": \"100\",\n    \"7-0\": \"Query text character count\",\n    \"7-1\": \"1500 characters\",\n    \"10-0\": \"# of sentiment-bearing phrases\",\n    \"10-1\": \"1000\",\n    \"11-0\": \"Document ID character count\",\n    \"11-1\": \"36 characters\",\n    \"12-0\": \"Max document size\",\n    \"12-1\": \"2048 characters\",\n    \"13-0\": \"Max incoming/outgoing batch size (Detailed Mode)\",\n    \"13-1\": \"100 documents / batch\",\n    \"14-0\": \"Max analysis size (Discovery Mode)\",\n    \"14-1\": \"1000 documents / analysis\",\n    \"15-0\": \"Data calls (calls which POST data to Semantria)\",\n    \"15-1\": \"10 calls/second\",\n    \"16-0\": \"Settings Calls (calls which alter your configuration settings)\",\n    \"16-1\": \"10 calls/second\",\n    \"17-0\": \"Polling Calls (calls GET'ing processed data)\",\n    \"17-1\": \"10 calls/second\",\n    \"8-0\": \"NEAR count per query\",\n    \"8-1\": \"5\",\n    \"9-0\": \"NEAR operator distance\",\n    \"9-1\": \"10\"\n  },\n  \"cols\": 2,\n  \"rows\": 18\n}\n[/block]\n\tSee the API Output for Detailed and Discovery analysis limits.","excerpt":"Semantria comes with default values and limits for most features. The user can change many limits on their end-- check the Basic Mode and Discovery Mode Quick References for those-- but other limits must be changed upon request. 
This section details Semantria API's default limits, additional features, and license expiration.","slug":"api-limits","type":"basic","title":"API Limits","__v":0,"childrenPages":[]}

API Limits

Semantria comes with default values and limits for most features. The user can change many limits on their end (check the Basic Mode and Discovery Mode Quick References for those), but other limits must be changed upon request. This section details the Semantria API's default limits, additional features, and license expiration.

[block:api-header]
{
  "type": "basic",
  "title": "Transactions"
}
[/block]

Transactions are the number of documents (tweets, articles, etc.) you send to the API for analysis. Examples:

* 100 articles = 100 transactions
* 300 tweets = 300 transactions

Different [accounts](https://semantria.com/prices) have different transaction limits. To view your number of available transactions, go to the [Dashboard](https://www.lexalytics.com/login).

[block:api-header]
{
  "type": "basic",
  "title": "Limits"
}
[/block]

Each configuration has default limits on analysis settings that can be modified by request. To make changes to any limits, do not hesitate to email us at [support@lexalytics.com](mailto:support@lexalytics.com).

[block:callout]
{
  "type": "warning",
  "title": "Character limits",
  "body": "Any character limits listed below refer to single byte characters."
}
[/block]

[block:parameters]
{
  "data": {
    "0-0": "Number of configurations per account",
    "0-1": "varies",
    "1-0": "Names of blacklists, configurations, categories, SBPs, entities",
    "1-1": "50 characters",
    "2-0": "# of blacklisted items",
    "2-1": "100",
    "3-0": "# of categories",
    "3-1": "100",
    "4-0": "# of samples per category",
    "4-1": "20",
    "5-0": "# of entities",
    "5-1": "1000",
    "6-0": "# of queries",
    "6-1": "100",
    "7-0": "Query text character count",
    "7-1": "1500 characters",
    "8-0": "NEAR count per query",
    "8-1": "5",
    "9-0": "NEAR operator distance",
    "9-1": "10",
    "10-0": "# of sentiment-bearing phrases",
    "10-1": "1000",
    "11-0": "Document ID character count",
    "11-1": "36 characters",
    "12-0": "Max document size",
    "12-1": "2048 characters",
    "13-0": "Max incoming/outgoing batch size (Detailed Mode)",
    "13-1": "100 documents / batch",
    "14-0": "Max analysis size (Discovery Mode)",
    "14-1": "1000 documents / analysis",
    "15-0": "Data calls (calls which POST data to Semantria)",
    "15-1": "10 calls/second",
    "16-0": "Settings calls (calls which alter your configuration settings)",
    "16-1": "10 calls/second",
    "17-0": "Polling calls (calls which GET processed data)",
    "17-1": "10 calls/second"
  },
  "cols": 2,
  "rows": 18
}
[/block]

See the API Output for Detailed and Discovery analysis limits.
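Each of the three call classes in the table above is limited to 10 calls per second. A minimal client-side pacing sketch, assuming a simple sleep-based loop (the official SDKs may handle rate limiting differently):

```python
import time

# Minimal client-side throttle for the 10 calls/second limit described
# above. make_call is a placeholder for any data, settings, or polling call.
MIN_INTERVAL = 1.0 / 10  # minimum seconds between calls

def throttled(calls, make_call):
    """Issue calls while staying under the per-second limit."""
    for call in calls:
        started = time.monotonic()
        make_call(call)
        elapsed = time.monotonic() - started
        if elapsed < MIN_INTERVAL:
            time.sleep(MIN_INTERVAL - elapsed)
```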
{"category":"577e4bf24159cd1900d5d2bd","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d324","createdAt":"2015-07-22T20:37:45.405Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":102,"body":"Semantria uses a signature algorithm for sending requests. There is a set of an API key and secret as authenticators for each license. Other web services like Amazon Web Services, XML Digital Signatures and OAuth also use these to authenticate. Authentication occurs every time the user sends a request to the Semantria API.\n  * Semantria uses a cryptographic hash function with a customer key and secret to create a signature for each request.\n  * All incoming and outgoing data passes through a secure HTTPS connection with a 256-bit SSL encryption and is encrypted at all times.\n  * Semantria protects the user from credential theft by storing a unique token generated from the real keys instead of storing the user's key and secret. The client-side application uses the authenticators to generate and pass this unique token for authorization.\n  * Any data sent to the server or processed results are removed 24 hours after it has been generated (so do not use Semantria as a server for storing data).\n  * Once a user registers and receives his keys, **Semantria does not have access to the keys anymore**. We have no way of resending the keys and no way of accessing the account using the keys; this keeps everything secure. Semantria can only verify whether keys are valid.\n  * Users can regenerate their keys via the [Semantria Dashboard](https://semantria.com/users/me/api).","excerpt":"","slug":"security-model","type":"basic","title":"Security Model","__v":0,"childrenPages":[]}

Security Model


Semantria uses a signature algorithm for sending requests. Each license comes with an API key and secret that act as authenticators. Other web services, such as Amazon Web Services, XML Digital Signatures, and OAuth, authenticate in a similar way. Authentication occurs every time the user sends a request to the Semantria API.

* Semantria uses a cryptographic hash function with the customer key and secret to create a signature for each request (a sketch of this general pattern follows this list).
* All incoming and outgoing data passes through a secure HTTPS connection with 256-bit SSL encryption and is encrypted at all times.
* Semantria protects the user from credential theft by storing a unique token generated from the real keys instead of storing the user's key and secret. The client-side application uses the authenticators to generate and pass this unique token for authorization.
* Any data sent to the server, and any processed results, are removed 24 hours after generation (so do not use Semantria as a server for storing data).
* Once a user registers and receives their keys, **Semantria does not have access to the keys anymore**. We have no way of resending the keys and no way of accessing the account using the keys; this keeps everything secure. Semantria can only verify whether keys are valid.
* Users can regenerate their keys via the [Semantria Dashboard](https://semantria.com/users/me/api).
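The exact signature algorithm is implemented inside the SDKs, but the general pattern, shared by services like AWS, is an HMAC computed over the request using the secret. A hypothetical sketch of that pattern, not the literal Semantria scheme:

```python
import hashlib
import hmac

# Hypothetical request-signing sketch: an HMAC-SHA256 over the request
# using the API secret. This illustrates the general pattern only; the
# actual Semantria signature algorithm is implemented in the SDKs.
def sign_request(api_secret: str, request_line: str) -> str:
    mac = hmac.new(api_secret.encode(), request_line.encode(), hashlib.sha256)
    return mac.hexdigest()

signature = sign_request("my-api-secret", "GET https://api.semantria.com/status.json")
print(signature)  # hex digest sent alongside the API key
```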
{"category":"577e4bf24159cd1900d5d2bd","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d325","createdAt":"2015-07-07T21:30:48.022Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":103,"body":"[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Detailed Mode Output\"\n}\n[/block]\nDetailed Mode performs analysis on individual documents. In the Semantria API the user can customize almost every part of the analysis; from constraining the number of results for each category to defining the parts of speech which the server will detect, the user can configure Detailed Mode to suit your needs in document sentiment analysis. In this section, we provide a quick reference for customizable options and parameters for POS tagging, as well as a detailed explanation of Detailed Mode's output.\n[block:callout]\n{\n  \"type\": \"info\",\n  \"body\": \"Our fully functional [API Console](https://semantria.com/developer/api-console) offers more explanations and a chance to play with the Semantria API in a browser.\",\n  \"title\": \"API Console\"\n}\n[/block]\n\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Line-by-line Term Explanation\"\n}\n[/block]\nThis output is from analyzing the text below. However, it has been abbreviated for clarity.\n\nGoogle Inc. is an American multinational not public corporation invested in Internet search, cloud computing, and advertising technologies. Google hosts and develops a number of Internet-based services and products, and generates profit primarily from advertising through its AdWords program. The company was founded by Larry Page and Sergey Brin, often dubbed the \"Google Guys\", while the two were attending Stanford University as PhD candidates.\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"{\\n#This array shows all auto categories found in the text\\n    \\\"auto_categories\\\": [\\n        {\\n#This is the relevance score for the auto category \\n            \\\"strength_score\\\": 0.51434803, \\n#This is the title of the auto category\\n            \\\"title\\\": \\\"IT\\\", \\n#This is the type of auto category - node for a category that can contain other categories, leaf for categories at the end of the tree\\n            \\\"type\\\": \\\"node\\\"\\n        }\\n    ], \\n#This is the ID of the config used to process the data\\n    \\\"config_id\\\": \\\"ed7b6405-2bc2-443d-b6c4-0feab9050c5d\\\", \\n#This array gives the detailes of the document. Each element in the array is a sentence. Only a single sentence is given here due to length.\\n    \\\"details\\\": [\\n        {\\n#If the sentence is imperative or not\\n            \\\"is_imperative\\\": false, \\n#If the sentence should carry sentiment\\n            \\\"is_polar\\\": true, \\n#This array lists all of the wordsin the sentence\\n            \\\"words\\\": [\\n                {\\n#Was the word negated via not or other negator\\n                    \\\"is_negated\\\": false, \\n#Sentiment score for the word\\n                    \\\"sentiment_score\\\": 0.0,\\n#Stemmed form of the word\\n                    \\\"stemmed\\\": \\\"google\\\",\\n#Part of speech tag. 
NNP is a proper noun\\n                    \\\"tag\\\": \\\"NNP\\\",\\n#Actual word\\n                    \\\"title\\\": \\\"Google\\\",\\n#Normalized part of speech tag. Proper nouns are types of nouns\\n                    \\\"type\\\": \\\"Noun\\\"\\n                },\\n#Many more words and sentences omitted\\n            ]\\n        }, \\n    ], \\n#The entities array lists all entities found.\\n    \\\"entities\\\": [\\n        {\\n #Did the entity match the optional confidence query\\n            \\\"confident\\\": true, \\n #What type of entity is it\\n            \\\"entity_type\\\": \\\"Company\\\",\\n #How much sentiment evidence is there?\\n            \\\"evidence\\\": 5, \\n #Was this entity a focus of the text?\\n            \\\"is_about\\\": true, \\n #The label. This can be overridden in user-defined entities.\\n            \\\"label\\\": \\\"Company\\\", \\n #Array of actual mentions of the entity.\\n            \\\"mentions\\\": [\\n                {\\n #Was the entity negated?\\n                    \\\"is_negated\\\": false, \\n #Actual word found in text.\\n                    \\\"label\\\": \\\"Google Inc.\\\",\\n #Locations info can be ued for hit-highlighting.\\n                    \\\"locations\\\": [\\n                        {\\n #Length of the string\\n                            \\\"length\\\": 11, \\n #Zero-based position of the actual string\\n                            \\\"offset\\\": 0\\n                        }\\n                    ]\\n                }, \\n                {\\n                    \\\"is_negated\\\": false, \\n#Note different actual string in text than the first mention\\n                    \\\"label\\\": \\\"Google\\\", \\n                    \\\"locations\\\": [\\n                        {\\n                            \\\"length\\\": 6, \\n                            \\\"offset\\\": 140\\n                        }\\n                    ]\\n                }\\n            ], \\n#Sentiment for the entity in words\\n            \\\"sentiment_polarity\\\": \\\"neutral\\\", \\n#Sentiment for the entity as a float\\n            \\\"sentiment_score\\\": -0.122652, \\n#Themes associated with the entity\\n            \\\"themes\\\": [\\n                {\\n#Amount of sentiment evidence for this theme\\n                    \\\"evidence\\\": 4,\\n#Is this theme a focus of the text?\\n                    \\\"is_about\\\": false, \\n#Array of actual mentions of the theme.\\n                    \\\"mentions\\\": [\\n                        {\\n                            \\\"is_negated\\\": true, \\n                            \\\"label\\\": \\\"public corporation\\\", \\n                            \\\"locations\\\": [\\n                                {\\n                                    \\\"length\\\": 18, \\n                                    \\\"offset\\\": 45\\n                                }\\n                            ], \\n#If an object is negated, the negating phrase\\n                            \\\"negating_phrase\\\": \\\"not\\\"\\n                        }\\n                    ], \\n#Normalized (lower-cased stemmed) version of the theme\\n                    \\\"normalized\\\": \\\"public corporate\\\",\\n#sentiment for the theme in words\\n                    \\\"sentiment_polarity\\\": \\\"neutral\\\",\\n#Sentiment for the theme in a float\\n                    \\\"sentiment_score\\\": -0.122652,\\n#Stemmed version of the theme\\n                    \\\"stemmed\\\": \\\"public corporate\\\",\\n#Relevancy of the theme to 
the entity\\n                    \\\"strength_score\\\": 1.0,\\n#Actual words of the theme\\n                    \\\"title\\\": \\\"public corporation\\\"\\n                }, \\n#More themes omitted\\n            ], \\n#Entity name\\n            \\\"title\\\": \\\"Google\\\",\\n#Named entities are automatically-discovered, user entities are defined\\n            \\\"type\\\": \\\"named\\\"\\n        }, \\n#More entities omitted for clarity\\n    ],\\n#The relations array lists the relationships found in the text\\n\\\"relations\\\": [\\n      {\\n#Named relations are auto-discovered\\n        \\\"type\\\": \\\"named\\\",\\n#the words triggering the relationship\\n        \\\"extra\\\": \\\"said\\\",\\n#The entities involved in the relationship\\n        \\\"entities\\\": [\\n          {\\n            \\\"title\\\": \\\"Sergey Brin\\\",\\n            \\\"entity_type\\\": \\\"Person\\\"\\n          },\\n          {\\n            \\\"title\\\": \\\"\\\\\\\"Google is marching ahead\\\\\\\"\\\",\\n            \\\"entity_type\\\": \\\"Quote\\\"\\n          }\\n        ],\\n#Type of relationship\\n        \\\"relation_type\\\": \\\"Quotation\\\",\\n        \\\"confidence_score\\\": 1\\n      }\\n    ],\\n#ID of the document \\n    \\\"id\\\": \\\"55fc6ebd-0001\\\",\\n#Language of document\\n    \\\"language\\\": \\\"English\\\", \\n#Confidence in the language\\n    \\\"language_score\\\": 0.38016528,\\n#Metadata contains metadata you passed to Semantria\\n    \\\"metadata\\\": {\\n        \\\"circulation\\\": 25, \\n        \\\"date\\\": \\\"20160325\\\"\\n    }, \\n#This dictionary lists the model-based sentiment scores\\n    \\\"model_sentiment\\\": {\\n#likelihood the document had a mixed sentiment score\\n        \\\"mixed_score\\\": 0.06166500225663185, \\n#Model name. 
Semantria ships with a default model.\\n        \\\"model_name\\\": \\\"default\\\", \\n#Likelihood the document had a negative score\\n        \\\"negative_score\\\": 0.09528054296970367,\\n#Likelihood the document had a neutral score\\n        \\\"neutral_score\\\": 0.6886150240898132,\\n#Likelihood the document had a neutral score\\n        \\\"positive_score\\\": 0.15443940460681915,\\n#Most likely sentiment polarity in words\\n        \\\"sentiment_polarity\\\": \\\"neutral\\\"\\n    }, \\n#This array lists all sentiment phrases found in the text.\\n    \\\"phrases\\\": { \\n        {\\n#Whether the phrase was intensified\\n            \\\"is_intensified\\\": false, \\n#Whether the phrase was negated\\n            \\\"is_negated\\\": false,\\n#length of phrase in bytes\\n      \\t\\t\\t\\\"length\\\" : 8,\\n#beginning position of phrase in bytes\\n      \\t\\t\\t\\\"offset\\\" : 362,\\n#Phrase sentiment in words\\n            \\\"sentiment_polarity\\\": \\\"negative\\\",\\n#Phrase sentiment in float\\n            \\\"sentiment_score\\\": -0.4,\\n#Actual phrase      \\n            \\\"title\\\": \\\"so wrong\\\",\\n#Whether detected or possible\\n            \\\"type\\\": \\\"detected\\\"\\n        }, \\n        {\\n            \\\"sentiment_polarity\\\": \\\"neutral\\\", \\n            \\\"title\\\": \\\"American multinational\\\",\\n#Semantria's suggestions of possible sentiment phrases to add to custom configuration\\n            \\\"type\\\": \\\"possible\\\"\\n        }, \\n#More phrases omitted for clarity.\\n    ],\\n#Sentiment of document in words\\n    \\\"sentiment_polarity\\\": \\\"neutral\\\",\\n#Sentiment of document as float\\n    \\\"sentiment_score\\\": 0.120261446,\\n#Semantria status of document.\\n    \\\"status\\\": \\\"PROCESSED\\\",\\n#Tag we submitted with document\\n\\t\\t\\\"tag\\\": \\\"Google analysis\\\",\\n#Summary of document\\n    \\\"summary\\\": \\\"Google Inc. is an American multinational not public corporation invested in Internet search, cloud computing, and advertising technologies... Google hosts and develops a number of Internet-based services and products, and generates profit primarily from advertising through its AdWords program... \\\",\\n#Array of themes relevant at a document level\\n    \\\"themes\\\": [\\n        {\\n            \\\"evidence\\\": 4, \\n            \\\"is_about\\\": true, \\n            \\\"mentions\\\": [\\n                {\\n                    \\\"is_negated\\\": true, \\n                    \\\"label\\\": \\\"public corporation\\\", \\n                    \\\"locations\\\": [\\n                        {\\n                            \\\"length\\\": 18, \\n                            \\\"offset\\\": 45\\n                        }\\n                    ], \\n                    \\\"negating_phrase\\\": \\\"not\\\"\\n                }\\n            ], \\n            \\\"normalized\\\": \\\"public corporate\\\", \\n            \\\"sentiment_polarity\\\": \\\"neutral\\\", \\n            \\\"sentiment_score\\\": -0.122652, \\n            \\\"stemmed\\\": \\\"public corporate\\\", \\n            \\\"strength_score\\\": 1.0, \\n            \\\"title\\\": \\\"public corporation\\\"\\n        }\\n#More themes omitted for clarity.     
\\n    ], \\n#This array lists topics discovered in text\\n    \\\"topics\\\": [\\n        {\\n#The number of query terms that hit in the document\\n            \\\"hitcount\\\": 4,\\n#The ID of the query\\n            \\\"id\\\": \\\"cb9b40e7-f663-4120-8db4-4b4f0689c63e\\\",\\n#An array listing the term hits\\n            \\\"mentions\\\": [\\n                {\\n#Whether the term was negated\\n                    \\\"is_negated\\\": false,\\n#The term that hit\\n                    \\\"label\\\": \\\"catalog\\\",\\n#An array of locations of the term\\n                    \\\"locations\\\": [\\n                        {\\n#The length in bytes of the term\\n                            \\\"length\\\": 7,\\n#The offset in bytes from the beginning of the document for the hit\\n                            \\\"offset\\\": 15\\n                        },\\n                        {\\n                            \\\"length\\\": 7,\\n                            \\\"offset\\\": 505\\n                        }\\n                    ]\\n                },\\n                {\\n                    \\\"is_negated\\\": false,\\n                    \\\"label\\\": \\\"toy\\\",\\n                    \\\"locations\\\": [\\n                        {\\n                            \\\"length\\\": 3,\\n                            \\\"offset\\\": 11\\n                        },\\n                        {\\n                            \\\"length\\\": 3,\\n                            \\\"offset\\\": 296\\n                        }\\n                    ]\\n                }\\n            ],\\n#The sentiment polarity of the query\\n            \\\"sentiment_polarity\\\": \\\"neutral\\\",\\n#The sentiment score of the query as a float\\n            \\\"sentiment_score\\\": 0.43459997,\\n#The name of the query\\n            \\\"title\\\": \\\"toys\\\",\\n#The type of query\\n            \\\"type\\\": \\\"query\\\"\\n        }\\n    ]\\n\\n        {\\n#Not used for concept topics          \\n            \\\"hitcount\\\": 0,\\n            \\\"sentiment_polarity\\\": \\\"neutral\\\", \\n            \\\"sentiment_score\\\": 0.120261446,\\n#Relevancy of topic to document          \\n            \\\"strength_score\\\": 0.55242544,\\n#Name of topic          \\n            \\\"title\\\": \\\"Advertising\\\", \\n            \\\"type\\\": \\\"concept\\\"\\n        }\\n#More topics omitted for clarity\\n    ]\\n}\\n\\n\",\n      \"language\": \"json\"\n    }\n  ]\n}\n[/block]\nDetailed mode limits apply to both document mode and source mode of analysis. All limits have integer values of 0 to 20. Setting a limit to a score of 0 signifies zero interest in the output and will prevent the result for that parameter from appearing in the dataset.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Detailed Mode output explanation\"\n}\n[/block]\nSemantria provides the user with a wealth of information in its sentiment analysis and data processing; sometimes it can be kind of hard to wade through. Here is a quick reference detailing everything the Semantria API will return to the user in Detailed Analysis Mode.\n\nEach document will have an *id* and each configuration has a unique *config_id*. The user can add *tags* and view the *status* of the document (\"queued,\" \"processed\" or \"failed\"). 
Semantria API will produce a *job_id* of the associated job, a *summary* of the document text, the *language* of the source text (and the *language score*, the percentage of the best language match among detected languages), and the *sentiment_score* and *sentiment_polarity*. \n\nIn detailed analysis of individual sentences, the API will return boolean values for *is_imperative* and *is_polar*. Imperative sentences, representing a action item, will be set to true. is_polar represents Semantria's guess as to whether the writer of the sentence meant to convey sentiment. For instance, \"Good morning all\" is a non-polar sentence despite containing a sentiment word of \"good.\"\n\nThe API will return a list of words grouped by the parent sentence. Each word will have a *tag*, POS *type*,* title*, *stemmed* form of the word, and *sentiment_score*.\n\nSemantria API will generate *auto_categories*; each category will have a* title*,* type* (\"node\"/root or \"leaf\"/nested value), *strength_score* (how much the category matches with document content), and *categories*, a list of sub-categories (if any exist).\n\n*phrases* are a list of sentiment-bearing phrases from the document. Each will have a *title, sentiment_score, sentiment_polarity* (negative, positive, or neutral),* is_negated* (whether the phrase has been negated), *negating_phrase* (if one exists),* is_intensified, intensifying_phrase* (if one exists), and *type* (either \"possible\" or \"detected\").\n\nThe Semantria API returns the *themes* of the document. Each has the *title*, main theme (*is_about*), the *normalized* form of the theme, the *stemmed* form of the theme, an *evidence* score, *strength_score* within the document, and *sentiment_polarity*. The API will return *mentions* of the theme: *expandable*, which is the text of the theme mention, *is_negated, negating_phrase*, and* locations*-- the list of coordinates of the mentions found within the document. *offset* is the number of bytes offset in the original text before the start of the mention, and *length* is the length of the mention in bytes.\n\nThe API returns entities with similar parameters to themes. Entities have additional parameters of *type* (either \"named\" or \"user\"),* confident* (whether the confidence queries matched for this entity), and the *entity_type* (Company, Person, Place, etc.). It will also return a list of themes related to this entity.\n\nSemantria API returns relations, which represent a connection between one or more Entities. These have a *type* (named or user value), *relation_type* (such as quotation), *confidence_score,  and extra* of the parent relationship.\n\nThe API will also return a list of opinions extracted from the source text. 
Each will have a *quotation, type* (the type of entity extracted-- named or user value), *speaker, topic, sentiment_score* and *sentiment_polarity*.\n\nFinally, Semantria API gives a list of topics, each with a *title, type, hitcount, strength_score, sentiment_score, sentiment_score* and* topics* (a list of sub-topics, if they exist).\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"API Options\"\n}\n[/block]\n\n[block:parameters]\n{\n  \"data\": {\n    \"h-0\": \"Option\",\n    \"h-1\": \"Description\",\n    \"h-2\": \"Default\",\n    \"0-0\": \"auto_response\",\n    \"1-0\": \"is_primary\",\n    \"2-0\": \"chars_threshold\",\n    \"3-0\": \"one_sentence\",\n    \"4-0\": \"process_html\",\n    \"5-0\": \"language\",\n    \"6-0\": \"callback\",\n    \"0-2\": \"False\",\n    \"1-2\": \"False\",\n    \"2-2\": \"80\",\n    \"3-2\": \"False\",\n    \"4-2\": \"False\",\n    \"5-2\": \"English\",\n    \"6-2\": \"Empty\",\n    \"6-1\": \"Defines a callback URL for automatic data responding (more info).\",\n    \"5-1\": \"Defines target language that will be used for task processing.\",\n    \"4-1\": \"Leads the service to clean HTML tags before processing.\",\n    \"3-1\": \"Leads the service to clean HTML tags before processing.\",\n    \"2-1\": \"Defines whether or not the service should respond with processed results on each incoming analytics document or discovery mode request.\",\n    \"1-1\": \"Identifies whether the current configuration is primary or not.\",\n    \"0-1\": \"Defines whether or not the service should respond with processed results on each incoming analytics document or discovery analysis request (more info).\"\n  },\n  \"cols\": 3,\n  \"rows\": 7\n}\n[/block]","excerpt":"","slug":"output","type":"basic","title":"Detailed API Output Explanation","__v":0,"childrenPages":[]}

Detailed API Output Explanation


[block:api-header] { "type": "basic", "title": "Detailed Mode Output" } [/block] Detailed Mode performs analysis on individual documents. In the Semantria API the user can customize almost every part of the analysis; from constraining the number of results for each category to defining the parts of speech which the server will detect, the user can configure Detailed Mode to suit your needs in document sentiment analysis. In this section, we provide a quick reference for customizable options and parameters for POS tagging, as well as a detailed explanation of Detailed Mode's output. [block:callout] { "type": "info", "body": "Our fully functional [API Console](https://semantria.com/developer/api-console) offers more explanations and a chance to play with the Semantria API in a browser.", "title": "API Console" } [/block] [block:api-header] { "type": "basic", "title": "Line-by-line Term Explanation" } [/block] This output is from analyzing the text below. However, it has been abbreviated for clarity. Google Inc. is an American multinational not public corporation invested in Internet search, cloud computing, and advertising technologies. Google hosts and develops a number of Internet-based services and products, and generates profit primarily from advertising through its AdWords program. The company was founded by Larry Page and Sergey Brin, often dubbed the "Google Guys", while the two were attending Stanford University as PhD candidates. [block:code] { "codes": [ { "code": "{\n#This array shows all auto categories found in the text\n \"auto_categories\": [\n {\n#This is the relevance score for the auto category \n \"strength_score\": 0.51434803, \n#This is the title of the auto category\n \"title\": \"IT\", \n#This is the type of auto category - node for a category that can contain other categories, leaf for categories at the end of the tree\n \"type\": \"node\"\n }\n ], \n#This is the ID of the config used to process the data\n \"config_id\": \"ed7b6405-2bc2-443d-b6c4-0feab9050c5d\", \n#This array gives the detailes of the document. Each element in the array is a sentence. Only a single sentence is given here due to length.\n \"details\": [\n {\n#If the sentence is imperative or not\n \"is_imperative\": false, \n#If the sentence should carry sentiment\n \"is_polar\": true, \n#This array lists all of the wordsin the sentence\n \"words\": [\n {\n#Was the word negated via not or other negator\n \"is_negated\": false, \n#Sentiment score for the word\n \"sentiment_score\": 0.0,\n#Stemmed form of the word\n \"stemmed\": \"google\",\n#Part of speech tag. NNP is a proper noun\n \"tag\": \"NNP\",\n#Actual word\n \"title\": \"Google\",\n#Normalized part of speech tag. Proper nouns are types of nouns\n \"type\": \"Noun\"\n },\n#Many more words and sentences omitted\n ]\n }, \n ], \n#The entities array lists all entities found.\n \"entities\": [\n {\n #Did the entity match the optional confidence query\n \"confident\": true, \n #What type of entity is it\n \"entity_type\": \"Company\",\n #How much sentiment evidence is there?\n \"evidence\": 5, \n #Was this entity a focus of the text?\n \"is_about\": true, \n #The label. 
This can be overridden in user-defined entities.\n \"label\": \"Company\", \n #Array of actual mentions of the entity.\n \"mentions\": [\n {\n #Was the entity negated?\n \"is_negated\": false, \n #Actual word found in text.\n \"label\": \"Google Inc.\",\n #Locations info can be ued for hit-highlighting.\n \"locations\": [\n {\n #Length of the string\n \"length\": 11, \n #Zero-based position of the actual string\n \"offset\": 0\n }\n ]\n }, \n {\n \"is_negated\": false, \n#Note different actual string in text than the first mention\n \"label\": \"Google\", \n \"locations\": [\n {\n \"length\": 6, \n \"offset\": 140\n }\n ]\n }\n ], \n#Sentiment for the entity in words\n \"sentiment_polarity\": \"neutral\", \n#Sentiment for the entity as a float\n \"sentiment_score\": -0.122652, \n#Themes associated with the entity\n \"themes\": [\n {\n#Amount of sentiment evidence for this theme\n \"evidence\": 4,\n#Is this theme a focus of the text?\n \"is_about\": false, \n#Array of actual mentions of the theme.\n \"mentions\": [\n {\n \"is_negated\": true, \n \"label\": \"public corporation\", \n \"locations\": [\n {\n \"length\": 18, \n \"offset\": 45\n }\n ], \n#If an object is negated, the negating phrase\n \"negating_phrase\": \"not\"\n }\n ], \n#Normalized (lower-cased stemmed) version of the theme\n \"normalized\": \"public corporate\",\n#sentiment for the theme in words\n \"sentiment_polarity\": \"neutral\",\n#Sentiment for the theme in a float\n \"sentiment_score\": -0.122652,\n#Stemmed version of the theme\n \"stemmed\": \"public corporate\",\n#Relevancy of the theme to the entity\n \"strength_score\": 1.0,\n#Actual words of the theme\n \"title\": \"public corporation\"\n }, \n#More themes omitted\n ], \n#Entity name\n \"title\": \"Google\",\n#Named entities are automatically-discovered, user entities are defined\n \"type\": \"named\"\n }, \n#More entities omitted for clarity\n ],\n#The relations array lists the relationships found in the text\n\"relations\": [\n {\n#Named relations are auto-discovered\n \"type\": \"named\",\n#the words triggering the relationship\n \"extra\": \"said\",\n#The entities involved in the relationship\n \"entities\": [\n {\n \"title\": \"Sergey Brin\",\n \"entity_type\": \"Person\"\n },\n {\n \"title\": \"\\\"Google is marching ahead\\\"\",\n \"entity_type\": \"Quote\"\n }\n ],\n#Type of relationship\n \"relation_type\": \"Quotation\",\n \"confidence_score\": 1\n }\n ],\n#ID of the document \n \"id\": \"55fc6ebd-0001\",\n#Language of document\n \"language\": \"English\", \n#Confidence in the language\n \"language_score\": 0.38016528,\n#Metadata contains metadata you passed to Semantria\n \"metadata\": {\n \"circulation\": 25, \n \"date\": \"20160325\"\n }, \n#This dictionary lists the model-based sentiment scores\n \"model_sentiment\": {\n#likelihood the document had a mixed sentiment score\n \"mixed_score\": 0.06166500225663185, \n#Model name. 
Semantria ships with a default model.\n \"model_name\": \"default\", \n#Likelihood the document had a negative score\n \"negative_score\": 0.09528054296970367,\n#Likelihood the document had a neutral score\n \"neutral_score\": 0.6886150240898132,\n#Likelihood the document had a neutral score\n \"positive_score\": 0.15443940460681915,\n#Most likely sentiment polarity in words\n \"sentiment_polarity\": \"neutral\"\n }, \n#This array lists all sentiment phrases found in the text.\n \"phrases\": { \n {\n#Whether the phrase was intensified\n \"is_intensified\": false, \n#Whether the phrase was negated\n \"is_negated\": false,\n#length of phrase in bytes\n \t\t\t\"length\" : 8,\n#beginning position of phrase in bytes\n \t\t\t\"offset\" : 362,\n#Phrase sentiment in words\n \"sentiment_polarity\": \"negative\",\n#Phrase sentiment in float\n \"sentiment_score\": -0.4,\n#Actual phrase \n \"title\": \"so wrong\",\n#Whether detected or possible\n \"type\": \"detected\"\n }, \n {\n \"sentiment_polarity\": \"neutral\", \n \"title\": \"American multinational\",\n#Semantria's suggestions of possible sentiment phrases to add to custom configuration\n \"type\": \"possible\"\n }, \n#More phrases omitted for clarity.\n ],\n#Sentiment of document in words\n \"sentiment_polarity\": \"neutral\",\n#Sentiment of document as float\n \"sentiment_score\": 0.120261446,\n#Semantria status of document.\n \"status\": \"PROCESSED\",\n#Tag we submitted with document\n\t\t\"tag\": \"Google analysis\",\n#Summary of document\n \"summary\": \"Google Inc. is an American multinational not public corporation invested in Internet search, cloud computing, and advertising technologies... Google hosts and develops a number of Internet-based services and products, and generates profit primarily from advertising through its AdWords program... \",\n#Array of themes relevant at a document level\n \"themes\": [\n {\n \"evidence\": 4, \n \"is_about\": true, \n \"mentions\": [\n {\n \"is_negated\": true, \n \"label\": \"public corporation\", \n \"locations\": [\n {\n \"length\": 18, \n \"offset\": 45\n }\n ], \n \"negating_phrase\": \"not\"\n }\n ], \n \"normalized\": \"public corporate\", \n \"sentiment_polarity\": \"neutral\", \n \"sentiment_score\": -0.122652, \n \"stemmed\": \"public corporate\", \n \"strength_score\": 1.0, \n \"title\": \"public corporation\"\n }\n#More themes omitted for clarity. 
\n ], \n#This array lists topics discovered in text\n \"topics\": [\n {\n#The number of query terms that hit in the document\n \"hitcount\": 4,\n#The ID of the query\n \"id\": \"cb9b40e7-f663-4120-8db4-4b4f0689c63e\",\n#An array listing the term hits\n \"mentions\": [\n {\n#Whether the term was negated\n \"is_negated\": false,\n#The term that hit\n \"label\": \"catalog\",\n#An array of locations of the term\n \"locations\": [\n {\n#The length in bytes of the term\n \"length\": 7,\n#The offset in bytes from the beginning of the document for the hit\n \"offset\": 15\n },\n {\n \"length\": 7,\n \"offset\": 505\n }\n ]\n },\n {\n \"is_negated\": false,\n \"label\": \"toy\",\n \"locations\": [\n {\n \"length\": 3,\n \"offset\": 11\n },\n {\n \"length\": 3,\n \"offset\": 296\n }\n ]\n }\n ],\n#The sentiment polarity of the query\n \"sentiment_polarity\": \"neutral\",\n#The sentiment score of the query as a float\n \"sentiment_score\": 0.43459997,\n#The name of the query\n \"title\": \"toys\",\n#The type of query\n \"type\": \"query\"\n }\n ]\n\n {\n#Not used for concept topics \n \"hitcount\": 0,\n \"sentiment_polarity\": \"neutral\", \n \"sentiment_score\": 0.120261446,\n#Relevancy of topic to document \n \"strength_score\": 0.55242544,\n#Name of topic \n \"title\": \"Advertising\", \n \"type\": \"concept\"\n }\n#More topics omitted for clarity\n ]\n}\n\n", "language": "json" } ] } [/block] Detailed mode limits apply to both document mode and source mode of analysis. All limits have integer values of 0 to 20. Setting a limit to a score of 0 signifies zero interest in the output and will prevent the result for that parameter from appearing in the dataset. [block:api-header] { "type": "basic", "title": "Detailed Mode output explanation" } [/block] Semantria provides the user with a wealth of information in its sentiment analysis and data processing; sometimes it can be kind of hard to wade through. Here is a quick reference detailing everything the Semantria API will return to the user in Detailed Analysis Mode. Each document will have an *id* and each configuration has a unique *config_id*. The user can add *tags* and view the *status* of the document ("queued," "processed" or "failed"). Semantria API will produce a *job_id* of the associated job, a *summary* of the document text, the *language* of the source text (and the *language score*, the percentage of the best language match among detected languages), and the *sentiment_score* and *sentiment_polarity*. In detailed analysis of individual sentences, the API will return boolean values for *is_imperative* and *is_polar*. Imperative sentences, representing a action item, will be set to true. is_polar represents Semantria's guess as to whether the writer of the sentence meant to convey sentiment. For instance, "Good morning all" is a non-polar sentence despite containing a sentiment word of "good." The API will return a list of words grouped by the parent sentence. Each word will have a *tag*, POS *type*,* title*, *stemmed* form of the word, and *sentiment_score*. Semantria API will generate *auto_categories*; each category will have a* title*,* type* ("node"/root or "leaf"/nested value), *strength_score* (how much the category matches with document content), and *categories*, a list of sub-categories (if any exist). *phrases* are a list of sentiment-bearing phrases from the document. 
Each will have a *title*, *sentiment_score*, *sentiment_polarity* (negative, positive, or neutral), *is_negated* (whether the phrase has been negated), *negating_phrase* (if one exists), *is_intensified*, *intensifying_phrase* (if one exists), and *type* (either "possible" or "detected").

The Semantria API returns the *themes* of the document. Each has a *title*, *is_about* (whether it is a main theme), the *normalized* form of the theme, the *stemmed* form of the theme, an *evidence* score, a *strength_score* within the document, and *sentiment_polarity*. The API will return *mentions* of the theme: *expandable*, which is the text of the theme mention, *is_negated*, *negating_phrase*, and *locations* -- the list of coordinates of the mentions found within the document. *offset* is the number of bytes offset in the original text before the start of the mention, and *length* is the length of the mention in bytes.

The API returns entities with parameters similar to themes. Entities have the additional parameters *type* (either "named" or "user"), *confident* (whether the confidence queries matched for this entity), and *entity_type* (Company, Person, Place, etc.). The API will also return a list of themes related to each entity.

The Semantria API returns relations, which represent a connection between one or more entities. These have a *type* (named or user value), a *relation_type* (such as quotation), a *confidence_score*, and the *extra* words of the parent relationship.

The API will also return a list of opinions extracted from the source text. Each will have a *quotation*, *type* (the type of entity extracted -- named or user value), *speaker*, *topic*, *sentiment_score* and *sentiment_polarity*.

Finally, the Semantria API gives a list of topics, each with a *title*, *type*, *hitcount*, *strength_score*, *sentiment_score*, *sentiment_polarity* and *topics* (a list of sub-topics, if they exist).

[block:api-header] { "type": "basic", "title": "API Options" } [/block]

[block:parameters] { "data": { "h-0": "Option", "h-1": "Description", "h-2": "Default", "0-0": "auto_response", "0-1": "Defines whether or not the service should respond with processed results to each incoming analytics document or discovery analysis request.", "0-2": "False", "1-0": "is_primary", "1-1": "Identifies whether the current configuration is primary or not.", "1-2": "False", "2-0": "chars_threshold", "2-1": "Defines the number of characters from the beginning of the text used for automatic language detection.", "2-2": "80", "3-0": "one_sentence", "3-1": "Leads the service to treat each incoming document as a single sentence.", "3-2": "False", "4-0": "process_html", "4-1": "Leads the service to clean HTML tags before processing.", "4-2": "False", "5-0": "language", "5-1": "Defines the target language that will be used for task processing.", "5-2": "English", "6-0": "callback", "6-1": "Defines a callback URL for automatic data responding.", "6-2": "Empty" }, "cols": 3, "rows": 7 } [/block]
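Since *offset* and *length* are byte positions, hit-highlighting should operate on the UTF-8 encoding of the original text rather than on character indexes. Below is a minimal sketch that walks the *entities* array of one parsed result (a `result` dict with the field names shown above) and wraps each mention in markers.

```python
def highlight_mentions(result: dict, text: str) -> str:
    """Wrap every entity mention in [[ ]] markers using byte offsets."""
    data = text.encode("utf-8")  # locations are byte-based, so work in bytes
    spans = []
    for entity in result.get("entities", []):
        for mention in entity.get("mentions", []):
            for loc in mention.get("locations", []):
                spans.append((loc["offset"], loc["length"]))
    # Apply highlights from the end of the text so that earlier
    # offsets remain valid as markers are inserted.
    for offset, length in sorted(spans, reverse=True):
        data = (data[:offset] + b"[[" +
                data[offset:offset + length] + b"]]" +
                data[offset + length:])
    return data.decode("utf-8")
```

For the sample output above, the mention at offset 0, length 11 ("Google Inc.") and the one at offset 140, length 6 ("Google") would both be wrapped, yielding "[[Google Inc.]] is an American multinational..." and so on.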
[block:api-header] { "type": "basic", "title": "Detailed Mode Output" } [/block] Detailed Mode performs analysis on individual documents. In the Semantria API the user can customize almost every part of the analysis; from constraining the number of results for each category to defining the parts of speech which the server will detect, the user can configure Detailed Mode to suit your needs in document sentiment analysis. In this section, we provide a quick reference for customizable options and parameters for POS tagging, as well as a detailed explanation of Detailed Mode's output. [block:callout] { "type": "info", "body": "Our fully functional [API Console](https://semantria.com/developer/api-console) offers more explanations and a chance to play with the Semantria API in a browser.", "title": "API Console" } [/block] [block:api-header] { "type": "basic", "title": "Line-by-line Term Explanation" } [/block] This output is from analyzing the text below. However, it has been abbreviated for clarity. Google Inc. is an American multinational not public corporation invested in Internet search, cloud computing, and advertising technologies. Google hosts and develops a number of Internet-based services and products, and generates profit primarily from advertising through its AdWords program. The company was founded by Larry Page and Sergey Brin, often dubbed the "Google Guys", while the two were attending Stanford University as PhD candidates. [block:code] { "codes": [ { "code": "{\n#This array shows all auto categories found in the text\n \"auto_categories\": [\n {\n#This is the relevance score for the auto category \n \"strength_score\": 0.51434803, \n#This is the title of the auto category\n \"title\": \"IT\", \n#This is the type of auto category - node for a category that can contain other categories, leaf for categories at the end of the tree\n \"type\": \"node\"\n }\n ], \n#This is the ID of the config used to process the data\n \"config_id\": \"ed7b6405-2bc2-443d-b6c4-0feab9050c5d\", \n#This array gives the detailes of the document. Each element in the array is a sentence. Only a single sentence is given here due to length.\n \"details\": [\n {\n#If the sentence is imperative or not\n \"is_imperative\": false, \n#If the sentence should carry sentiment\n \"is_polar\": true, \n#This array lists all of the wordsin the sentence\n \"words\": [\n {\n#Was the word negated via not or other negator\n \"is_negated\": false, \n#Sentiment score for the word\n \"sentiment_score\": 0.0,\n#Stemmed form of the word\n \"stemmed\": \"google\",\n#Part of speech tag. NNP is a proper noun\n \"tag\": \"NNP\",\n#Actual word\n \"title\": \"Google\",\n#Normalized part of speech tag. Proper nouns are types of nouns\n \"type\": \"Noun\"\n },\n#Many more words and sentences omitted\n ]\n }, \n ], \n#The entities array lists all entities found.\n \"entities\": [\n {\n #Did the entity match the optional confidence query\n \"confident\": true, \n #What type of entity is it\n \"entity_type\": \"Company\",\n #How much sentiment evidence is there?\n \"evidence\": 5, \n #Was this entity a focus of the text?\n \"is_about\": true, \n #The label. 
This can be overridden in user-defined entities.\n \"label\": \"Company\", \n #Array of actual mentions of the entity.\n \"mentions\": [\n {\n #Was the entity negated?\n \"is_negated\": false, \n #Actual word found in text.\n \"label\": \"Google Inc.\",\n #Locations info can be ued for hit-highlighting.\n \"locations\": [\n {\n #Length of the string\n \"length\": 11, \n #Zero-based position of the actual string\n \"offset\": 0\n }\n ]\n }, \n {\n \"is_negated\": false, \n#Note different actual string in text than the first mention\n \"label\": \"Google\", \n \"locations\": [\n {\n \"length\": 6, \n \"offset\": 140\n }\n ]\n }\n ], \n#Sentiment for the entity in words\n \"sentiment_polarity\": \"neutral\", \n#Sentiment for the entity as a float\n \"sentiment_score\": -0.122652, \n#Themes associated with the entity\n \"themes\": [\n {\n#Amount of sentiment evidence for this theme\n \"evidence\": 4,\n#Is this theme a focus of the text?\n \"is_about\": false, \n#Array of actual mentions of the theme.\n \"mentions\": [\n {\n \"is_negated\": true, \n \"label\": \"public corporation\", \n \"locations\": [\n {\n \"length\": 18, \n \"offset\": 45\n }\n ], \n#If an object is negated, the negating phrase\n \"negating_phrase\": \"not\"\n }\n ], \n#Normalized (lower-cased stemmed) version of the theme\n \"normalized\": \"public corporate\",\n#sentiment for the theme in words\n \"sentiment_polarity\": \"neutral\",\n#Sentiment for the theme in a float\n \"sentiment_score\": -0.122652,\n#Stemmed version of the theme\n \"stemmed\": \"public corporate\",\n#Relevancy of the theme to the entity\n \"strength_score\": 1.0,\n#Actual words of the theme\n \"title\": \"public corporation\"\n }, \n#More themes omitted\n ], \n#Entity name\n \"title\": \"Google\",\n#Named entities are automatically-discovered, user entities are defined\n \"type\": \"named\"\n }, \n#More entities omitted for clarity\n ],\n#The relations array lists the relationships found in the text\n\"relations\": [\n {\n#Named relations are auto-discovered\n \"type\": \"named\",\n#the words triggering the relationship\n \"extra\": \"said\",\n#The entities involved in the relationship\n \"entities\": [\n {\n \"title\": \"Sergey Brin\",\n \"entity_type\": \"Person\"\n },\n {\n \"title\": \"\\\"Google is marching ahead\\\"\",\n \"entity_type\": \"Quote\"\n }\n ],\n#Type of relationship\n \"relation_type\": \"Quotation\",\n \"confidence_score\": 1\n }\n ],\n#ID of the document \n \"id\": \"55fc6ebd-0001\",\n#Language of document\n \"language\": \"English\", \n#Confidence in the language\n \"language_score\": 0.38016528,\n#Metadata contains metadata you passed to Semantria\n \"metadata\": {\n \"circulation\": 25, \n \"date\": \"20160325\"\n }, \n#This dictionary lists the model-based sentiment scores\n \"model_sentiment\": {\n#likelihood the document had a mixed sentiment score\n \"mixed_score\": 0.06166500225663185, \n#Model name. 
Semantria ships with a default model.\n \"model_name\": \"default\", \n#Likelihood the document had a negative score\n \"negative_score\": 0.09528054296970367,\n#Likelihood the document had a neutral score\n \"neutral_score\": 0.6886150240898132,\n#Likelihood the document had a neutral score\n \"positive_score\": 0.15443940460681915,\n#Most likely sentiment polarity in words\n \"sentiment_polarity\": \"neutral\"\n }, \n#This array lists all sentiment phrases found in the text.\n \"phrases\": { \n {\n#Whether the phrase was intensified\n \"is_intensified\": false, \n#Whether the phrase was negated\n \"is_negated\": false,\n#length of phrase in bytes\n \t\t\t\"length\" : 8,\n#beginning position of phrase in bytes\n \t\t\t\"offset\" : 362,\n#Phrase sentiment in words\n \"sentiment_polarity\": \"negative\",\n#Phrase sentiment in float\n \"sentiment_score\": -0.4,\n#Actual phrase \n \"title\": \"so wrong\",\n#Whether detected or possible\n \"type\": \"detected\"\n }, \n {\n \"sentiment_polarity\": \"neutral\", \n \"title\": \"American multinational\",\n#Semantria's suggestions of possible sentiment phrases to add to custom configuration\n \"type\": \"possible\"\n }, \n#More phrases omitted for clarity.\n ],\n#Sentiment of document in words\n \"sentiment_polarity\": \"neutral\",\n#Sentiment of document as float\n \"sentiment_score\": 0.120261446,\n#Semantria status of document.\n \"status\": \"PROCESSED\",\n#Tag we submitted with document\n\t\t\"tag\": \"Google analysis\",\n#Summary of document\n \"summary\": \"Google Inc. is an American multinational not public corporation invested in Internet search, cloud computing, and advertising technologies... Google hosts and develops a number of Internet-based services and products, and generates profit primarily from advertising through its AdWords program... \",\n#Array of themes relevant at a document level\n \"themes\": [\n {\n \"evidence\": 4, \n \"is_about\": true, \n \"mentions\": [\n {\n \"is_negated\": true, \n \"label\": \"public corporation\", \n \"locations\": [\n {\n \"length\": 18, \n \"offset\": 45\n }\n ], \n \"negating_phrase\": \"not\"\n }\n ], \n \"normalized\": \"public corporate\", \n \"sentiment_polarity\": \"neutral\", \n \"sentiment_score\": -0.122652, \n \"stemmed\": \"public corporate\", \n \"strength_score\": 1.0, \n \"title\": \"public corporation\"\n }\n#More themes omitted for clarity. 
\n ], \n#This array lists topics discovered in text\n \"topics\": [\n {\n#The number of query terms that hit in the document\n \"hitcount\": 4,\n#The ID of the query\n \"id\": \"cb9b40e7-f663-4120-8db4-4b4f0689c63e\",\n#An array listing the term hits\n \"mentions\": [\n {\n#Whether the term was negated\n \"is_negated\": false,\n#The term that hit\n \"label\": \"catalog\",\n#An array of locations of the term\n \"locations\": [\n {\n#The length in bytes of the term\n \"length\": 7,\n#The offset in bytes from the beginning of the document for the hit\n \"offset\": 15\n },\n {\n \"length\": 7,\n \"offset\": 505\n }\n ]\n },\n {\n \"is_negated\": false,\n \"label\": \"toy\",\n \"locations\": [\n {\n \"length\": 3,\n \"offset\": 11\n },\n {\n \"length\": 3,\n \"offset\": 296\n }\n ]\n }\n ],\n#The sentiment polarity of the query\n \"sentiment_polarity\": \"neutral\",\n#The sentiment score of the query as a float\n \"sentiment_score\": 0.43459997,\n#The name of the query\n \"title\": \"toys\",\n#The type of query\n \"type\": \"query\"\n }\n ]\n\n {\n#Not used for concept topics \n \"hitcount\": 0,\n \"sentiment_polarity\": \"neutral\", \n \"sentiment_score\": 0.120261446,\n#Relevancy of topic to document \n \"strength_score\": 0.55242544,\n#Name of topic \n \"title\": \"Advertising\", \n \"type\": \"concept\"\n }\n#More topics omitted for clarity\n ]\n}\n\n", "language": "json" } ] } [/block] Detailed mode limits apply to both document mode and source mode of analysis. All limits have integer values of 0 to 20. Setting a limit to a score of 0 signifies zero interest in the output and will prevent the result for that parameter from appearing in the dataset. [block:api-header] { "type": "basic", "title": "Detailed Mode output explanation" } [/block] Semantria provides the user with a wealth of information in its sentiment analysis and data processing; sometimes it can be kind of hard to wade through. Here is a quick reference detailing everything the Semantria API will return to the user in Detailed Analysis Mode. Each document will have an *id* and each configuration has a unique *config_id*. The user can add *tags* and view the *status* of the document ("queued," "processed" or "failed"). Semantria API will produce a *job_id* of the associated job, a *summary* of the document text, the *language* of the source text (and the *language score*, the percentage of the best language match among detected languages), and the *sentiment_score* and *sentiment_polarity*. In detailed analysis of individual sentences, the API will return boolean values for *is_imperative* and *is_polar*. Imperative sentences, representing a action item, will be set to true. is_polar represents Semantria's guess as to whether the writer of the sentence meant to convey sentiment. For instance, "Good morning all" is a non-polar sentence despite containing a sentiment word of "good." The API will return a list of words grouped by the parent sentence. Each word will have a *tag*, POS *type*,* title*, *stemmed* form of the word, and *sentiment_score*. Semantria API will generate *auto_categories*; each category will have a* title*,* type* ("node"/root or "leaf"/nested value), *strength_score* (how much the category matches with document content), and *categories*, a list of sub-categories (if any exist). *phrases* are a list of sentiment-bearing phrases from the document. 
Each will have a *title, sentiment_score, sentiment_polarity* (negative, positive, or neutral),* is_negated* (whether the phrase has been negated), *negating_phrase* (if one exists),* is_intensified, intensifying_phrase* (if one exists), and *type* (either "possible" or "detected"). The Semantria API returns the *themes* of the document. Each has the *title*, main theme (*is_about*), the *normalized* form of the theme, the *stemmed* form of the theme, an *evidence* score, *strength_score* within the document, and *sentiment_polarity*. The API will return *mentions* of the theme: *expandable*, which is the text of the theme mention, *is_negated, negating_phrase*, and* locations*-- the list of coordinates of the mentions found within the document. *offset* is the number of bytes offset in the original text before the start of the mention, and *length* is the length of the mention in bytes. The API returns entities with similar parameters to themes. Entities have additional parameters of *type* (either "named" or "user"),* confident* (whether the confidence queries matched for this entity), and the *entity_type* (Company, Person, Place, etc.). It will also return a list of themes related to this entity. Semantria API returns relations, which represent a connection between one or more Entities. These have a *type* (named or user value), *relation_type* (such as quotation), *confidence_score, and extra* of the parent relationship. The API will also return a list of opinions extracted from the source text. Each will have a *quotation, type* (the type of entity extracted-- named or user value), *speaker, topic, sentiment_score* and *sentiment_polarity*. Finally, Semantria API gives a list of topics, each with a *title, type, hitcount, strength_score, sentiment_score, sentiment_score* and* topics* (a list of sub-topics, if they exist). [block:api-header] { "type": "basic", "title": "API Options" } [/block] [block:parameters] { "data": { "h-0": "Option", "h-1": "Description", "h-2": "Default", "0-0": "auto_response", "1-0": "is_primary", "2-0": "chars_threshold", "3-0": "one_sentence", "4-0": "process_html", "5-0": "language", "6-0": "callback", "0-2": "False", "1-2": "False", "2-2": "80", "3-2": "False", "4-2": "False", "5-2": "English", "6-2": "Empty", "6-1": "Defines a callback URL for automatic data responding (more info).", "5-1": "Defines target language that will be used for task processing.", "4-1": "Leads the service to clean HTML tags before processing.", "3-1": "Leads the service to clean HTML tags before processing.", "2-1": "Defines whether or not the service should respond with processed results on each incoming analytics document or discovery mode request.", "1-1": "Identifies whether the current configuration is primary or not.", "0-1": "Defines whether or not the service should respond with processed results on each incoming analytics document or discovery analysis request (more info)." }, "cols": 3, "rows": 7 } [/block]
{"category":"577e4bf24159cd1900d5d2bd","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"55a6a2fb51457325000e4e3d","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d326","createdAt":"2015-09-18T13:00:50.549Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":104,"body":"[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Discovery Mode Output\"\n}\n[/block]\nSemantria’s Discovery Mode provides you with a bird's eye view of your content after the sentiment has been analyzed. In this mode you can discover the top entities, themes, facets, topics, and overarching sentiment count of your group of documents. Semantria does the counting and rollup for you. This can be a good way to get on overview of what is in your data without having to put it into another tool to aggregate the output.\n\n\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Discovery Mode output explanation\"\n}\n[/block]\nIn Discovery mode, sentiment average is not supported. Instead, the positive, negative, and neutral counts available for facets are analogous to the positive/negative ratio; these show how many mentions of each type were in the text with respect to a certain facet.\n\nLike in Detailed Mode, Discovery Analysis Output will have an analysis id, config_id, job_id, tag and status. The analysis will also return themes, entities and topics.\n\nDiscovery Mode Analysis returns the facets extracted from all documents in the batch of discovery analysis. Each facet has a label (the text of the facet), count (number of occurrences), negative_count (number of negative occurrences), netural_count, positive_count, and mentions.Each mention will have a label, an indicator, and negating_phrase. The API will also return attributes associated with the facet with accompanying labels, counts and mentions.\n\nUsers have the option for the Semantria API to return the original source document in addition to the processed results. This is useful for multi-level integrations. The option can be switched on and off on upon request.\n\nThis analysis can give you insight into problem areas within your business. With Discovery Mode you can see which of your hotel branches is underperforming or which competing brand is generating the most buzz on Twitter. Additionally, you can see the reasons behind the positive and negative feedback and quickly use that information to improve. You can also use your Discovery data to create charts and tables so that you can understand your data from a quick glance.\n\nWhen sending a collection of texts (e.g. a set of 1,000 tweets) to Semantria, all content is analyzed simultaneously. All recurring mentions of an entity or theme are counted and available to you. In Discovery Mode our Excel Add-In will only display facets, attributes, and sentiment count due to Excel limitations, but Semantria is doing more work behind the curtain.\n\nFor example, when running Discovery Mode on a collection that contains the sentence “My waiter was rude!” Semantria will identify the word “waiter” as a facet and search for it throughout the rest of the texts. 
The attributes associated with “waiter” found within the collection are then consolidated so you will know how many people share the same feelings towards the waiter.","excerpt":"","slug":"discovery-output-explanation","type":"basic","title":"Discovery Output Explanation","__v":0,"childrenPages":[]}

Discovery Output Explanation


[block:api-header] { "type": "basic", "title": "Discovery Mode Output" } [/block] Semantria’s Discovery Mode provides you with a bird's eye view of your content after the sentiment has been analyzed. In this mode you can discover the top entities, themes, facets, topics, and overarching sentiment count of your group of documents. Semantria does the counting and rollup for you. This can be a good way to get on overview of what is in your data without having to put it into another tool to aggregate the output. [block:api-header] { "type": "basic", "title": "Discovery Mode output explanation" } [/block] In Discovery mode, sentiment average is not supported. Instead, the positive, negative, and neutral counts available for facets are analogous to the positive/negative ratio; these show how many mentions of each type were in the text with respect to a certain facet. Like in Detailed Mode, Discovery Analysis Output will have an analysis id, config_id, job_id, tag and status. The analysis will also return themes, entities and topics. Discovery Mode Analysis returns the facets extracted from all documents in the batch of discovery analysis. Each facet has a label (the text of the facet), count (number of occurrences), negative_count (number of negative occurrences), netural_count, positive_count, and mentions.Each mention will have a label, an indicator, and negating_phrase. The API will also return attributes associated with the facet with accompanying labels, counts and mentions. Users have the option for the Semantria API to return the original source document in addition to the processed results. This is useful for multi-level integrations. The option can be switched on and off on upon request. This analysis can give you insight into problem areas within your business. With Discovery Mode you can see which of your hotel branches is underperforming or which competing brand is generating the most buzz on Twitter. Additionally, you can see the reasons behind the positive and negative feedback and quickly use that information to improve. You can also use your Discovery data to create charts and tables so that you can understand your data from a quick glance. When sending a collection of texts (e.g. a set of 1,000 tweets) to Semantria, all content is analyzed simultaneously. All recurring mentions of an entity or theme are counted and available to you. In Discovery Mode our Excel Add-In will only display facets, attributes, and sentiment count due to Excel limitations, but Semantria is doing more work behind the curtain. For example, when running Discovery Mode on a collection that contains the sentence “My waiter was rude!” Semantria will identify the word “waiter” as a facet and search for it throughout the rest of the texts. The attributes associated with “waiter” found within the collection are then consolidated so you will know how many people share the same feelings towards the waiter.
{"category":"577e4bf24159cd1900d5d2bd","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d327","createdAt":"2015-07-07T21:30:57.379Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":105,"body":"200\tServer request correct and accepted. The server responds with data according to the document sent and the configuration of the auto response feature.\n\n202\tServer request correct and accepted. Server doesn’t respond with any data and just serves the request.\n\n400\tWrong request format. Server responds with details.\n\n401\tAuthentication failed.\n\n402\tRequest is unauthorized. The number of calls limit has been reached or the license is expired.\n\n403 Request is forbidden. Server responds with details.\n\n404\tNo documents or collections with the provided unique configuration ID were found on the server.\n\n406\tBatch, collection or other configuration limits reached. Server responds with details.\n\n413\tCharacter limit for single document exceeded.\n\n500\tServer side issue. Server may respond with the details in response body.","excerpt":"You can see detailed examples under the endpoint operation pages.","slug":"error-statuses","type":"basic","title":"Error Codes","__v":0,"childrenPages":[]}

Error Codes

You can see detailed examples under the endpoint operation pages.

200: Server request correct and accepted. The server responds with data according to the document sent and the configuration of the auto-response feature.

202: Server request correct and accepted. The server doesn't respond with any data and just serves the request.

400: Wrong request format. The server responds with details.

401: Authentication failed.

402: Request is unauthorized. The call limit has been reached or the license has expired.

403: Request is forbidden. The server responds with details.

404: No documents or collections with the provided unique configuration ID were found on the server.

406: Batch, collection, or other configuration limits reached. The server responds with details.

413: Character limit for a single document exceeded.

500: Server-side issue. The server may respond with details in the response body.
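As an illustration, a small Python helper using the `requests` library could translate these codes into readable errors. The mapping below comes directly from the table above; the helper itself is a sketch of this document's codes, not part of any Semantria SDK, and how you obtain an authenticated response is outside its scope.

import requests

# Status codes and meanings as documented above.
STATUS_MEANINGS = {
    200: "Accepted; the response body contains data.",
    202: "Accepted; no response body.",
    400: "Wrong request format; see the response body for details.",
    401: "Authentication failed.",
    402: "Unauthorized: call limit reached or license expired.",
    403: "Forbidden; see the response body for details.",
    404: "No documents or collections found for the given configuration ID.",
    406: "Batch, collection, or other configuration limit reached.",
    413: "Character limit for a single document exceeded.",
    500: "Server-side issue; details may be in the response body.",
}

def check_response(response: requests.Response) -> requests.Response:
    """Raise a readable error for any non-success Semantria status code."""
    meaning = STATUS_MEANINGS.get(response.status_code, "Unexpected status code.")
    if response.status_code >= 400:
        raise RuntimeError(f"{response.status_code}: {meaning} Body: {response.text}")
    return response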
{"category":"577e4bf24159cd1900d5d2bd","parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","updates":[],"_id":"577e4bf34159cd1900d5d329","createdAt":"2015-07-22T22:47:31.953Z","link_external":false,"link_url":"","githubsync":"","sync_unique":"","hidden":false,"api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"isReference":false,"order":107,"body":"* A trial account will expire after 30 days by default\n* For subscription accounts, there is a sliding expiration: account will expire after a certain period of inactivity\n* Legacy accounts expire after one year\n\nYour license expiration is usually set to the last day of your subscription's validity. The account will go offline at midnight GMT at the end of the last day of the term. When a license expires, Semantria will return a 402 HTTP status and send an email to the email address used when registering with Semantria.\n\nA trial account can be converted to a non-expiring paid subscription on our [pricing page](https://semantria.com/pricing).","excerpt":"","slug":"license-expiration","type":"basic","title":"License Expiration","__v":0,"childrenPages":[]}

License Expiration


* A trial account expires after 30 days by default.
* Subscription accounts have a sliding expiration: the account expires after a certain period of inactivity.
* Legacy accounts expire after one year.

Your license expiration is usually set to the last day of your subscription's validity. The account goes offline at midnight GMT at the end of the last day of the term. When a license expires, Semantria returns a 402 HTTP status and sends an email to the address used when registering with Semantria.

A trial account can be converted to a non-expiring paid subscription on our [pricing page](https://semantria.com/pricing).
{"__v":0,"_id":"577e4bf34159cd1900d5d32a","api":{"examples":{"codes":[{"language":"text","code":"GET https://api.semantria.com/features.json?language=en","name":""}]},"results":{"codes":[{"status":200,"language":"json","code":"HTTP/1.0 200 Request accepted and served.\n[\n  {\n\t\"id\": \"en\",\n\t\"language\": \"English\",\n\t\"html_processing\": true,\n\t\"settings\": {\n\t  \"blacklist\": true,\n\t  \"user_entities\": true,\n\t  \"sentiment_phrases\": true,\n\t  \"user_categories\": true,\n\t  \"queries\": true\n\t},\n\t\"detailed_mode\": {\n\t  \"language_detection\": true,\n\t  \"pos_tagging\": true,\n\t  \"intentions\": true,\n\t  \"theme_mentions\": true,\n\t  \"sentiment_phrases\": true,\n\t  \"entity_themes\": true,\n\t  \"themes\": true,\n\t  \"entity_relations\": true,\n\t  \"named_entities\": true,\n\t  \"sentiment\": true,\n\t  \"entity_mentions\": true,\n\t  \"summarization\": true,\n\t  \"user_entities\": true,\n\t  \"queries\": true,\n\t  \"auto_categories\": true,\n\t  \"user_categories\": true,\n\t  \"entity_opinions\": true\n\t},\n\t\"discovery_mode\": {\n\t  \"named_entities\": true,\n\t  \"entity_mentions\": true,\n\t  \"facet_mentioins\": true,\n\t  \"facets\": true,\n\t  \"user_entities\": true,\n\t  \"theme_mentions\": true,\n\t  \"user_categories\": true,\n\t  \"themes\": true,\n\t  \"queries\": true,\n\t  \"facet_attributes\": true\n\t}\n  }\n]","name":""},{"status":400,"language":"json","code":"{}","name":""},{"status":200,"language":"xml","code":"HTTP/1.0 200 Request accepted and served.\n<supported_features>\n  <features>\n\t<detailed_mode>\n\t  <language_detection>true</language_detection>\n\t  <pos_tagging>true</pos_tagging>\n\t  <intentions>true</intentions>\n\t  <theme_mentions>true</theme_mentions>\n\t  <sentiment_phrases>true</sentiment_phrases>\n\t  <entity_themes>true</entity_themes>\n\t  <themes>true</themes>\n\t  <entity_relations>true</entity_relations>\n\t  <named_entities>true</named_entities>\n\t  <sentiment>true</sentiment>\n\t  <entity_mentions>true</entity_mentions>\n\t  <summarization>true</summarization>\n\t  <user_entities>true</user_entities>\n\t  <queries>true</queries>\n\t  <auto_categories>true</auto_categories>\n\t  <user_categories>true</user_categories>\n\t  <entity_opinions>true</entity_opinions>\n\t</detailed_mode>\n\t<discovery_mode>\n\t  <named_entities>true</named_entities>\n\t  <entity_mentions>true</entity_mentions>\n\t  <facet_mentioins>true</facet_mentioins>\n\t  <facets>true</facets>\n\t  <user_entities>true</user_entities>\n\t  <theme_mentions>true</theme_mentions>\n\t  <user_categories>true</user_categories>\n\t  <themes>true</themes>\n\t  <queries>true</queries>\n\t  <facet_attributes>true</facet_attributes>\n\t</discovery_mode>\n\t<html_processing>true</html_processing>\n\t<id>en</id>\n\t<language>English</language>\n\t<settings>\n\t  <blacklist>true</blacklist>\n\t  <user_entities>true</user_entities>\n\t  <sentiment_phrases>true</sentiment_phrases>\n\t  <user_categories>true</user_categories>\n\t  <queries>true</queries>\n\t</settings>\n  </features>\n</supported_features>"}]},"settings":"","auth":"required","params":[{"_id":"55b16cc1b3a7e037008ac286","ref":"","required":false,"desc":"ISO language code","default":"","type":"string","name":"language","in":"query"}],"url":"/features.[json | xml]"},"body":"","category":"577e4bf24159cd1900d5d2bd","createdAt":"2015-07-23T22:37:53.265Z","editedParams":true,"editedParams2":true,"excerpt":"This method returns a list of the supported features per languages supported by the 
Semantria API. If no parameter is passed, Semantria will respond with a list of supported features, organized by language.","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":108,"parentDoc":null,"project":"559ae8ec7ae7f80d0096d813","slug":"checking-supported-features-by-language","sync_unique":"","title":"Checking Supported Features by Language","type":"get","updates":[],"user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

GET Checking Supported Features by Language

This method returns a list of the supported features for each language supported by the Semantria API. If no parameter is passed, Semantria will respond with a list of supported features, organized by language.

Query Params

language (string, optional): ISO language code

Definition

GET https://api.semantria.com/features.[json | xml]

Examples

GET https://api.semantria.com/features.json?language=en

Result Format

HTTP/1.0 200 Request accepted and served.
[
  {
    "id": "en",
    "language": "English",
    "html_processing": true,
    "settings": {
      "blacklist": true,
      "user_entities": true,
      "sentiment_phrases": true,
      "user_categories": true,
      "queries": true
    },
    "detailed_mode": {
      "language_detection": true,
      "pos_tagging": true,
      "intentions": true,
      "theme_mentions": true,
      "sentiment_phrases": true,
      "entity_themes": true,
      "themes": true,
      "entity_relations": true,
      "named_entities": true,
      "sentiment": true,
      "entity_mentions": true,
      "summarization": true,
      "user_entities": true,
      "queries": true,
      "auto_categories": true,
      "user_categories": true,
      "entity_opinions": true
    },
    "discovery_mode": {
      "named_entities": true,
      "entity_mentions": true,
      "facet_mentioins": true,
      "facets": true,
      "user_entities": true,
      "theme_mentions": true,
      "user_categories": true,
      "themes": true,
      "queries": true,
      "facet_attributes": true
    }
  }
]
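A minimal Python sketch of calling this endpoint with the `requests` library, using the optional `language` query parameter. Note that real calls must be authenticated with your Semantria credentials; request signing is omitted here for brevity, so an unsigned request like this would return a 401 in practice.

import requests

# Query the features endpoint; omit `params` to get features for every language.
resp = requests.get(
    "https://api.semantria.com/features.json",
    params={"language": "en"},
)
if resp.status_code == 200:
    for lang in resp.json():
        enabled = ", ".join(k for k, v in lang["detailed_mode"].items() if v)
        print(f'{lang["language"]} ({lang["id"]}): {enabled}')
else:
    print(f"Request failed: {resp.status_code}")  # e.g. 401 without authentication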
{"__v":0,"_id":"57d704548bd2f30e004aeb21","api":{"results":{"codes":[{"status":200,"language":"json","code":"{}","name":""},{"status":400,"language":"json","code":"{}","name":""}]},"settings":"","auth":"required","params":[],"url":""},"body":"[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Introduction to Pubnub\"\n}\n[/block]\nPubNub is low latency, real time message passing framework which enables communication at a global scale. Messages are sent and received over communication “channels” through the PubNub data stream network using the PubNub API. [PubNub BLOCKS](https://www.pubnub.com/products/blocks/) are micro-services which have the power to alter and passively monitor these messages mid-flight.\n\nLexalytics now offers a text analytics BLOCK which makes the power of Semantria available to the PubNub community. Properly formatted documents published to the “seminput” channel are forwarded to Semantria for processing. The results are then published on the “semoutput” channel.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Block Operation\"\n}\n[/block]\nEach publication to the “seminput” channel will initiate a process which copies the input documents to a local queue and then attempts to perform the following steps:\n\n1. Submit the locally queued documents to Semantria.\n\n2. Retrieve processed documents from Semantria and publish them to the “semoutput” channel.\n[block:callout]\n{\n  \"type\": \"warning\",\n  \"body\": \"Step 1 (submission to Semantria) will only initiate when the number of documents in the local queue exceeds the Batch Limit of the Semantria subscription OR when the elapsed time since the last submission to Semantria exceeds the Batch Delay of 30 seconds. The Batch Limit is the maximum number of documents that can be sent to Semantria in a single batch.\"\n}\n[/block]\n\n[block:callout]\n{\n  \"type\": \"warning\",\n  \"body\": \"Step 2 (retrieval from Semantria) will only initiate if the elapsed time since the last retrieval from Semantria exceeds the Poll Delay of 30 seconds.\"\n}\n[/block]\nThese limitations are in place to lessen the communication burden between PubNub and Semantria.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Submitting Documents\"\n}\n[/block]\nTo use the block, documents need to be published as a properly formatted JSON object to the “seminput” channel. Here is an example with two documents:\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"{\\n\\t“docs”: [\\n\\n\\t\\t\\t\\t\\t\\t{“text”: “This is document #1.”},\\n\\n\\t\\t\\t\\t\\t\\t{“text”: “This is document #2.”, “id”: “123”}\\n\\t]\\n}\",\n      \"language\": \"json\"\n    }\n  ]\n}\n[/block]\nNote that the documents are sent as dictionaries in an array called “docs”. The text of each document is defined by the “text” field. An optional document ID can be specified using the “id” field. If an ID is not specified, then a random, numeric ID will be generated. All fields must be strings.\n[block:api-header]\n{\n  \"type\": \"basic\",\n  \"title\": \"Retrieving Documents\"\n}\n[/block]\nThe block will request to poll Semantria for results every time it receives a message on “seminput”, regardless of whether or not the message included any documents. 
If results are returned from Semantria, then they will be posted to “semoutput”.\n\nTo attempt a poll Semantria (or attempt submission) without sending new documents, a blank message can be published to “seminput”:\n[block:code]\n{\n  \"codes\": [\n    {\n      \"code\": \"{\\n\\n}\",\n      \"language\": \"json\"\n    }\n  ]\n}\n[/block]","category":"57d6ff9e4340330e00953d3c","createdAt":"2016-09-12T19:39:00.293Z","excerpt":"","githubsync":"","hidden":false,"isReference":false,"link_external":false,"link_url":"","order":999,"project":"559ae8ec7ae7f80d0096d813","slug":"integration-pubnub","sync_unique":"","title":"Pubnub","type":"basic","updates":[],"user":"559ae88c7ae7f80d0096d812","version":"577e4bf24159cd1900d5d2aa","childrenPages":[]}

PubNub


Introduction to PubNub

PubNub is a low-latency, real-time message-passing framework that enables communication at global scale. Messages are sent and received over communication "channels" through the PubNub data stream network using the PubNub API. [PubNub BLOCKS](https://www.pubnub.com/products/blocks/) are microservices that can alter and passively monitor these messages mid-flight.

Lexalytics now offers a text analytics BLOCK that makes the power of Semantria available to the PubNub community. Properly formatted documents published to the "seminput" channel are forwarded to Semantria for processing. The results are then published on the "semoutput" channel.

Block Operation

Each publication to the "seminput" channel initiates a process that copies the input documents to a local queue and then attempts to perform the following steps:

1. Submit the locally queued documents to Semantria.

2. Retrieve processed documents from Semantria and publish them to the "semoutput" channel.

Warning: Step 1 (submission to Semantria) will only initiate when the number of documents in the local queue exceeds the Batch Limit of the Semantria subscription OR when the elapsed time since the last submission to Semantria exceeds the Batch Delay of 30 seconds. The Batch Limit is the maximum number of documents that can be sent to Semantria in a single batch.

Warning: Step 2 (retrieval from Semantria) will only initiate if the elapsed time since the last retrieval from Semantria exceeds the Poll Delay of 30 seconds.

These limitations are in place to lessen the communication burden between PubNub and Semantria.

Submitting Documents

To use the block, documents must be published as a properly formatted JSON object to the "seminput" channel. Here is an example with two documents:

{
  "docs": [
    {"text": "This is document #1."},
    {"text": "This is document #2.", "id": "123"}
  ]
}

Note that the documents are sent as dictionaries in an array called "docs". The text of each document is defined by the "text" field. An optional document ID can be specified using the "id" field. If an ID is not specified, a random numeric ID will be generated. All fields must be strings.

Retrieving Documents

The block polls Semantria for results every time it receives a message on "seminput", whether or not the message includes any documents. If results are returned from Semantria, they are posted to "semoutput".

To poll Semantria (or attempt a submission) without sending new documents, publish a blank message to "seminput":

{

}
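To show the channel flow end to end, here is a sketch using the PubNub Python SDK (v4-style API): it subscribes to "semoutput" for processed results, then publishes a two-document batch to "seminput". The publish/subscribe keys and the UUID are placeholders for your own values, not values from this document.

from pubnub.pnconfiguration import PNConfiguration
from pubnub.pubnub import PubNub
from pubnub.callbacks import SubscribeCallback

# Placeholder credentials; substitute your own PubNub keys.
pnconfig = PNConfiguration()
pnconfig.subscribe_key = "your-subscribe-key"
pnconfig.publish_key = "your-publish-key"
pnconfig.uuid = "semantria-block-demo"
pubnub = PubNub(pnconfig)

class ResultListener(SubscribeCallback):
    def status(self, pubnub, status):
        pass  # connection lifecycle events; ignored in this sketch

    def presence(self, pubnub, presence):
        pass  # presence events; not used here

    def message(self, pubnub, event):
        # Processed documents from Semantria arrive on "semoutput".
        print(event.message)

# Listen for results before publishing.
pubnub.add_listener(ResultListener())
pubnub.subscribe().channels("semoutput").execute()

# Publish two documents in the format shown above; all fields are strings.
pubnub.publish().channel("seminput").message({
    "docs": [
        {"text": "This is document #1."},
        {"text": "This is document #2.", "id": "123"},
    ]
}).sync()

Remember that results are not immediate: submission and retrieval are gated by the Batch Limit, the 30-second Batch Delay, and the 30-second Poll Delay described above, so the listener may only see output after a subsequent message (even a blank one) triggers a poll.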