An independent performance comparison against an industry-standard sentiment analysis tool
Head-to-head evaluation on the Financial PhraseBank academic benchmark dataset
We used the Financial PhraseBank dataset from HuggingFace, an academic benchmark of financial news sentences with expert-annotated sentiment labels. A few representative examples:
| Sentence | Ground Truth |
|---|---|
| Operating profit rose to EUR 13.1 mn from EUR 8.7 mn in 2007. | positive |
| Sales in Finland decreased by 10.5% in January. | negative |
| The company has operations in Finland, Sweden and Norway. | neutral |
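For reference, here is a minimal sketch of loading the benchmark with the HuggingFace `datasets` library. The `sentences_50agree` config is an assumption; the dataset ships several annotator-agreement splits (`sentences_50agree`, `sentences_66agree`, `sentences_75agree`, `sentences_allagree`), so pick whichever matches your sample:

```python
from datasets import load_dataset

# Financial PhraseBank ships several annotator-agreement configs;
# "sentences_50agree" keeps sentences where >= 50% of annotators agreed.
# Depending on your `datasets` version, trust_remote_code=True may be needed.
ds = load_dataset("financial_phrasebank", "sentences_50agree", split="train")

label_names = ds.features["label"].names  # ['negative', 'neutral', 'positive']
sentences = ds["sentence"]
ground_truth = [label_names[i] for i in ds["label"]]
```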
Both tools were run on the same 306 sentences. Snowflake Cortex was queried via SQL, and its scored results were exported to CSV for offline comparison.
```sql
-- Snowflake Cortex sentiment analysis over the benchmark sentences.
-- SENTIMENT() returns a score in [-1, 1]; bucket it into five labels.
-- Note: the strong-negative check must come before the negative one,
-- otherwise scores <= -0.60 are swallowed by the NEGATIVE branch.
SELECT
    FULL_SENTENCE,
    SNOWFLAKE.CORTEX.SENTIMENT(FULL_SENTENCE) AS sentiment_score,
    CASE
        WHEN sentiment_score >= 0.60  THEN 'STRONG_POSITIVE'
        WHEN sentiment_score >= 0.25  THEN 'POSITIVE'
        WHEN sentiment_score <= -0.60 THEN 'STRONG_NEGATIVE'
        WHEN sentiment_score <= -0.25 THEN 'NEGATIVE'
        ELSE 'NEUTRAL'
    END AS sentiment_label
FROM FINANCIAL_PHRASEBANK_SENTENCES;
```
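The same bucketing can be reproduced client-side; a minimal Python equivalent of the CASE expression above, with the thresholds copied verbatim from the SQL:

```python
def bucket_score(score: float) -> str:
    """Map a Cortex sentiment score in [-1, 1] to the five buckets above."""
    if score >= 0.60:
        return "STRONG_POSITIVE"
    if score >= 0.25:
        return "POSITIVE"
    if score <= -0.60:
        return "STRONG_NEGATIVE"
    if score <= -0.25:
        return "NEGATIVE"
    return "NEUTRAL"
```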
```python
from ones_rs.ones_rs import OnesEngine
from sklearn.metrics import accuracy_score, f1_score

# Initialize ONES-RS with the financial-domain lexicon
engine = OnesEngine()
engine.load_lexicon('financial_lexicon.json')

# Analyze all sentences (`sentences` and `ground_truth` come from the
# dataset; see the loading sketch above and the full script below)
results = engine.analyze_batch_auto(sentences)
predictions = [r[2] for r in results]  # third element is the sentiment label

# Calculate metrics against the expert annotations
accuracy = accuracy_score(ground_truth, predictions)
f1_macro = f1_score(ground_truth, predictions, average='macro')
```
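Printing the two metrics reproduces the ONES-RS numbers reported in the results table below:

```python
print(f"Accuracy: {accuracy:.4f}")   # 0.6863 on this 306-sentence sample
print(f"Macro F1: {f1_macro:.4f}")   # 0.519
```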
ONES-RS outperforms Snowflake Cortex on overall accuracy and on the neutral and positive classes that dominate financial text; Snowflake Cortex retains an edge on the negative class.
| Metric | Snowflake Cortex | ONES-RS | Winner |
|---|---|---|---|
| Accuracy | 48.04% | 68.63% | ONES-RS (+20.59 pts) |
| Macro F1 | 0.461 | 0.519 | ONES-RS (+0.058) |
| Weighted F1 | 0.487 | 0.673 | ONES-RS (+0.186) |
| Neutral F1 | 0.524 | 0.781 | ONES-RS (+0.257) |
| Positive F1 | 0.431 | 0.653 | ONES-RS (+0.222) |
| Negative F1 | 0.427 | 0.122 | Snowflake Cortex (+0.305) |
| Throughput | Cloud API (network-bound) | 350K+ texts/sec (local) | ONES-RS |
The F1 scores by sentiment class point to three factors behind the gap:

- Domain lexicon: ONES-RS uses domain-specific lexicons optimized for financial terminology, unlike general-purpose sentiment models.
- Neutral-heavy data: financial text is predominantly neutral (59.8% of this sample). ONES-RS achieves F1 = 0.781 on this critical class vs. Snowflake's 0.524.
- Throughput: ONES-RS processes 350,000+ texts per second locally, with no cloud API latency or rate limits.
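The neutral share matters because it sets a floor for the accuracy numbers above: always predicting neutral would already score 59.8% on this sample. A minimal sketch to verify the class balance, reusing `ground_truth` from the loading snippet:

```python
from collections import Counter

# Class distribution of the expert labels
dist = Counter(ground_truth)
total = sum(dist.values())
for label, count in sorted(dist.items()):
    print(f"{label}: {count} ({count / total:.1%})")

# Majority-class baseline: predict the most common label everywhere
majority_label, majority_count = dist.most_common(1)[0]
print(f"Majority baseline ({majority_label}): {majority_count / total:.1%}")
```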
Full comparison code to reproduce these results:
```python
import pandas as pd
from sklearn.metrics import accuracy_score, f1_score, classification_report
from ones_rs.ones_rs import OnesEngine

# Load the Snowflake export
df = pd.read_csv('snowflake_export.csv')

# The source rows use a "sentence@label" format; extract the ground truth
df['ground_truth'] = df['FULL_SENTENCE'].apply(
    lambda x: x.rsplit('@', 1)[1].strip() if '@' in str(x) else None
)
df = df[df['ground_truth'].notna()]

# Collapse Snowflake's five buckets to the benchmark's 3-class scheme
def map_label(label):
    label = str(label).lower()
    if 'positive' in label:
        return 'positive'
    elif 'negative' in label:
        return 'negative'
    return 'neutral'

df['snowflake_pred'] = df['SENTIMENT_LABEL'].apply(map_label)

# Run ONES-RS on the bare sentences (label suffix stripped)
engine = OnesEngine()
engine.load_lexicon('financial_lexicon.json')
sentences = [s.rsplit('@', 1)[0].strip() for s in df['FULL_SENTENCE']]
results = engine.analyze_batch_auto(sentences)
df['ones_pred'] = [r[2] for r in results]

# Compare
print("Snowflake Accuracy:", accuracy_score(df['ground_truth'], df['snowflake_pred']))
print("ONES-RS Accuracy:  ", accuracy_score(df['ground_truth'], df['ones_pred']))
print("Snowflake Macro F1:", f1_score(df['ground_truth'], df['snowflake_pred'], average='macro'))
print("ONES-RS Macro F1:  ", f1_score(df['ground_truth'], df['ones_pred'], average='macro'))

print("\nONES-RS Classification Report:")
print(classification_report(df['ground_truth'], df['ones_pred']))
```
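For error analysis, an optional addition to the script above dumps the rows where the two tools disagree; the `disagreements.csv` filename is just an example:

```python
# Optional: inspect the sentences the two tools label differently
disagree = df[df['snowflake_pred'] != df['ones_pred']]
cols = ['FULL_SENTENCE', 'ground_truth', 'snowflake_pred', 'ones_pred']
disagree[cols].to_csv('disagreements.csv', index=False)
print(f"{len(disagree)} of {len(df)} sentences received different labels")
```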
Experience the performance advantage of domain-optimized sentiment analysis for your financial documents.
Dataset: Financial PhraseBank (Malo et al., 2014) | Snowflake Cortex accessed December 2024