Semantria scores sentiment based on a pre-configured dictionary of phrases that are broadly applicable to many domains. However, each domain also has specific phrases whose sentiment differs from their general usage. You can increase sentiment accuracy by editing your configuration.
Sentiment phrases can be one to three words long, or they can be a Boolean query. When a configured phrase contains a shorter configured phrase, the longest one wins. For instance, the word "crude" is scored as negative out of the box, but the phrase "crude oil" is not sentiment bearing. Because "crude oil" is configured as neutral and is longer than "crude", the phrase "crude oil" receives no sentiment when it appears in the text.
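The longest-match rule can be sketched in Python as follows. This is only an illustration under stated assumptions: the PHRASES dictionary, its scores, and the match_phrases function are hypothetical and are not part of Semantria's API. Whitespace tokenization stands in for the real engine's text processing.

```python
# Hypothetical phrase dictionary; scores are illustrative, not Semantria's defaults.
PHRASES = {
    ("crude",): -0.5,
    ("crude", "oil"): 0.0,  # configured as neutral
}
MAX_PHRASE_LEN = 3  # phrases are one to three words long


def match_phrases(text):
    """Return (phrase, score) pairs, preferring the longest phrase at each position."""
    tokens = text.lower().split()
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so that "crude oil" beats "crude".
        for n in range(min(MAX_PHRASE_LEN, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in PHRASES:
                matches.append((" ".join(candidate), PHRASES[candidate]))
                i += n  # skip past the matched phrase so sub-phrases are not re-scored
                break
        else:
            i += 1  # no phrase starts at this token
    return matches
```

With this sketch, match_phrases("crude oil prices rose") yields only the neutral "crude oil" entry, while match_phrases("a crude remark") yields the negative "crude" entry.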
Part of speech also plays a role. We generally don't want to assign sentiment to proper nouns: "love" may be positive, but "Courtney Love" is not. Thus, the phrase not only has to exist in the text, it also has to match the expected part of speech.
Phrases also obey negators and intensifiers, such as "not" and "very". Because of this, you should usually not enter a negated phrase such as "not good". Enter the phrase that carries the sentiment ("good") and let the NLP engine figure out the negation.
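To see why entering only the base phrase is enough, consider this rough Python sketch of how negators and intensifiers might modify a phrase score. Everything in it is an assumption made for illustration: the word lists, the 1.5 multiplier, the two-word look-back window, and the score_text function are hypothetical and do not describe the engine's actual rules.

```python
# Hypothetical modifier lists and scores, chosen only for illustration.
NEGATORS = {"not", "never"}
INTENSIFIERS = {"very": 1.5}
PHRASES = {"good": 0.5}
LOOKBACK = 2  # assumed number of preceding words checked for modifiers


def score_text(text):
    """Sum phrase scores, flipping or scaling them based on nearby modifiers."""
    tokens = text.lower().split()
    total = 0.0
    for i, token in enumerate(tokens):
        if token not in PHRASES:
            continue
        score = PHRASES[token]
        # Apply any modifiers found in the words just before the phrase.
        for previous in tokens[max(0, i - LOOKBACK):i]:
            if previous in INTENSIFIERS:
                score *= INTENSIFIERS[previous]
            elif previous in NEGATORS:
                score = -score
        total += score
    return total
```

In this sketch the single entry "good" covers "good", "not good", "very good", and "not very good", so a separate "not good" entry would be redundant.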
When you add sentiment phrases to a configuration, you can give them a score of -2 to +2. We recommend keeping the scores within -1 to +1, but for particularly strong words, you can exceed that.
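If you maintain your phrase list in a script before uploading it, a small check like the one below can catch out-of-range scores early. This is a minimal sketch; check_phrase_score is a hypothetical helper, not a Semantria function, and the "ok" and "strong" labels are made up for this example.

```python
def check_phrase_score(score):
    """Validate a phrase score against the -2 to +2 range.

    Returns "ok" for scores within the recommended -1 to +1 range and
    "strong" for scores outside it that are still allowed.
    """
    if not -2.0 <= score <= 2.0:
        raise ValueError(f"score {score} is outside the allowed range -2 to +2")
    return "ok" if -1.0 <= score <= 1.0 else "strong"
```

A result of "strong" is not an error; it simply flags entries worth a second look to confirm that the word really is that strong.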
When tuning, pay attention to the frequency of the words in your data set and focus on the most frequently occurring ones. Also, think about alternative uses that might not be sentiment bearing, especially with single words. For instance, "garbage" might seem on the surface to be always negative, but "Taking out the garbage" is likely not sentiment bearing, and certainly "garbage truck" or "garbage collectors" are not. Below are some examples.
small NEAR/3 screen
"worst experience ever"