ONES-RS Benchmarks

Independent performance comparisons against industry-standard sentiment analysis tools

Benchmark

ONES-RS vs Snowflake Cortex Sentiment

Head-to-head comparison on the Financial PhraseBank academic benchmark dataset

1. Dataset: Financial PhraseBank

We used the Financial PhraseBank dataset, obtained from Hugging Face. It is an academic benchmark of financial news sentences with expert-annotated sentiment labels.

306 Test Sentences
3 Sentiment Classes
59.8% Neutral
29.4% Positive
10.8% Negative

Sample Data: Financial PhraseBank

Sentence | Ground Truth
Operating profit rose to EUR 13.1 mn from EUR 8.7 mn in 2007. | positive
Sales in Finland decreased by 10.5% in January. | negative
The company has operations in Finland, Sweden and Norway. | neutral
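
The class shares above imply a simple reference point: a classifier that always answers "neutral" would already reach about 59.8% accuracy on this split. A minimal sketch of that calculation follows. The per-class counts (183, 90, 33) are an assumption, back-computed from the reported percentages and the 306-sentence total; they are not figures published on this page.

```python
from collections import Counter

# Assumed counts, back-computed from 59.8% / 29.4% of 306 sentences.
labels = ["neutral"] * 183 + ["positive"] * 90 + ["negative"] * 33


def class_shares(labels):
    """Return {label: share of total} for a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most frequent label."""
    counts = Counter(labels)
    return counts.most_common(1)[0][1] / len(labels)


shares = class_shares(labels)
baseline = majority_baseline_accuracy(labels)
print({label: round(share * 100, 1) for label, share in shares.items()})
print(f"Majority-class baseline accuracy: {baseline:.1%}")
```

Reading the accuracy figures against this baseline gives a sense of how much each tool adds beyond the class imbalance itself.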

2. Methodology

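Both tools were run on the same 306 sentences. Snowflake Cortex was accessed via SQL: its SENTIMENT function returns a score between -1 and 1, which the query buckets into labels using fixed thresholds. The labeled results were then exported for comparison.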

SQL snowflake_sentiment.sql
-- Snowflake Cortex Sentiment Analysis
SELECT
    FULL_SENTENCE,
    SNOWFLAKE.CORTEX.SENTIMENT(FULL_SENTENCE) AS sentiment_score,
    CASE
        WHEN sentiment_score >= 0.60 THEN 'STRONG_POSITIVE'
        WHEN sentiment_score >= 0.25 THEN 'POSITIVE'
        WHEN sentiment_score > -0.25 THEN 'NEUTRAL'
        -- -0.60 cutoff assumed symmetric with the STRONG_POSITIVE threshold
        WHEN sentiment_score > -0.60 THEN 'NEGATIVE'
        ELSE 'STRONG_NEGATIVE'
    END AS sentiment_label
FROM FINANCIAL_PHRASEBANK_SENTENCES;
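
The score-to-label step in the query can also be expressed outside SQL, which makes the thresholds easy to unit-test. The sketch below is an illustration only: `bucket_score` and `to_three_class` are hypothetical helper names, the positive and neutral cutoffs are copied from the query, and the -0.60 cutoff for the strong-negative bucket is an assumption made by symmetry with the positive side.

```python
def bucket_score(score: float) -> str:
    """Map a sentiment score in [-1, 1] to a five-way label."""
    if score >= 0.60:
        return "STRONG_POSITIVE"
    if score >= 0.25:
        return "POSITIVE"
    if score > -0.25:
        return "NEUTRAL"
    if score > -0.60:  # assumed cutoff, mirrors the positive side
        return "NEGATIVE"
    return "STRONG_NEGATIVE"


def to_three_class(label: str) -> str:
    """Collapse the five-way label to positive / negative / neutral."""
    lowered = label.lower()
    if "positive" in lowered:
        return "positive"
    if "negative" in lowered:
        return "negative"
    return "neutral"


for score in (0.8, 0.3, 0.0, -0.25, -0.9):
    five_way = bucket_score(score)
    print(score, five_way, to_three_class(five_way))
```

Because the strong and regular buckets collapse to the same class, only the two 0.25 boundaries affect a three-class comparison.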
Python ones_rs_analysis.py
from ones_rs.ones_rs import OnesEngine
from sklearn.metrics import accuracy_score, f1_score

# Initialize ONES-RS with financial domain lexicon
engine = OnesEngine()
engine.load_lexicon('financial_lexicon.json')

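# Analyze all sentences. `sentences` and `ground_truth` are parallel lists
# loaded from the dataset (see benchmark_comparison.py).
results = engine.analyze_batch_auto(sentences)
predictions = [r[2] for r in results]  # index 2 of each result is the sentiment label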

# Calculate metrics
accuracy = accuracy_score(ground_truth, predictions)
f1_macro = f1_score(ground_truth, predictions, average='macro')
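
For readers without scikit-learn at hand, the metrics used here can be computed directly. This is a minimal sketch of accuracy, per-class F1, macro F1, and weighted F1 on a toy example. The function names and the six toy labels are illustrative and are not taken from the benchmark.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label that appears in y_true."""
    scores = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged by each class's share of y_true."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


y_true = ["neutral", "neutral", "neutral", "positive", "positive", "negative"]
y_pred = ["neutral", "neutral", "positive", "positive", "neutral", "neutral"]

print("accuracy:", accuracy(y_true, y_pred))
print("per-class F1:", per_class_f1(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
print("weighted F1:", weighted_f1(y_true, y_pred))
```

Macro F1 treats every class equally, so a weak minority class lowers it more than it lowers weighted F1 or accuracy.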

3. Results

ONES-RS outperforms Snowflake Cortex on accuracy, macro F1, weighted F1, and the neutral and positive classes, but trails it by a wide margin on the negative class.

Results (306 sentences)

Metric | Snowflake Cortex | ONES-RS | Winner
Accuracy | 48.04% | 68.63% | ONES-RS (+20.59 pp)
Macro F1 | 0.461 | 0.519 | ONES-RS (+0.058)
Weighted F1 | 0.487 | 0.673 | ONES-RS (+0.186)
Neutral F1 | 0.524 | 0.781 | ONES-RS (+0.257)
Positive F1 | 0.431 | 0.653 | ONES-RS (+0.222)
Negative F1 | 0.427 | 0.122 | Snowflake (+0.305)
Speed | Cloud API | 350K+ texts/sec | ONES-RS

Per-Class Performance

F1 scores by sentiment class:

Neutral: Snowflake 0.524 | ONES-RS 0.781
Positive: Snowflake 0.431 | ONES-RS 0.653
Negative: Snowflake 0.427 | ONES-RS 0.122
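
The aggregate scores in the results table can be cross-checked against these per-class values: macro F1 is the unweighted mean of the three class scores, and weighted F1 weights each class by its share of the test set. The sketch below does that arithmetic. The 10.8% negative share is an assumption, taken as the remainder after the reported 59.8% neutral and 29.4% positive shares.

```python
# Per-class F1 values as reported in this section.
per_class_f1 = {
    "Snowflake": {"neutral": 0.524, "positive": 0.431, "negative": 0.427},
    "ONES-RS": {"neutral": 0.781, "positive": 0.653, "negative": 0.122},
}

# Class shares; the negative share is assumed as the remainder.
class_share = {"neutral": 0.598, "positive": 0.294, "negative": 0.108}


def macro(scores):
    """Unweighted mean of per-class scores."""
    return sum(scores.values()) / len(scores)


def weighted(scores, shares):
    """Mean of per-class scores weighted by class share."""
    return sum(scores[c] * shares[c] for c in scores)


for tool, scores in per_class_f1.items():
    print(
        f"{tool}: macro F1 = {macro(scores):.3f}, "
        f"weighted F1 = {weighted(scores, class_share):.3f}"
    )
```

Under these assumptions the macro values match the reported 0.461 and 0.519, and the weighted values land within about 0.001 of the reported 0.487 and 0.673, a gap consistent with rounding of the inputs.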

Key Findings

Financial Domain Optimization

ONES-RS uses domain-specific lexicons optimized for financial terminology, unlike general-purpose sentiment models.

Neutral Detection Excellence

Financial text is predominantly neutral (59.8%). ONES-RS achieves F1=0.781 vs Snowflake's 0.524 on this critical class.

Extreme Speed

ONES-RS processes 350,000+ texts per second locally, with no cloud API latency or rate limits.
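
Throughput depends on hardware, batch size, and text length, so it is worth measuring on the target machine. A minimal timing sketch follows. `measure_throughput` is a hypothetical helper, and `dummy_analyze` is a stand-in for a real batch call such as `engine.analyze_batch_auto`; no ONES-RS code is exercised here, so the number it prints says nothing about ONES-RS itself.

```python
import time
from typing import Callable, Sequence


def measure_throughput(
    analyze_batch: Callable[[Sequence[str]], object],
    texts: Sequence[str],
    repeats: int = 3,
) -> float:
    """Return the best observed texts-per-second over `repeats` runs."""
    best = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        analyze_batch(texts)
        elapsed = time.perf_counter() - start
        if elapsed > 0:
            best = max(best, len(texts) / elapsed)
    return best


def dummy_analyze(texts: Sequence[str]) -> list:
    # Placeholder workload: counts words per text instead of scoring sentiment.
    return [len(text.split()) for text in texts]


texts = ["Operating profit rose to EUR 13.1 mn from EUR 8.7 mn in 2007."] * 10_000
print(f"{measure_throughput(dummy_analyze, texts):,.0f} texts/sec")
```

Taking the best of several runs reduces the effect of one-off warm-up costs; swapping `dummy_analyze` for the real batch call gives a figure comparable to the one quoted above.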

4. Reproduce This Benchmark

Full comparison code to reproduce these results:

Python benchmark_comparison.py
import pandas as pd
from sklearn.metrics import accuracy_score, f1_score, classification_report
from ones_rs.ones_rs import OnesEngine

# Load Snowflake export
df = pd.read_csv('snowflake_export.csv')

# Extract ground truth from "sentence@label" format
df['ground_truth'] = df['FULL_SENTENCE'].apply(
    lambda x: x.rsplit('@', 1)[1].strip() if '@' in str(x) else None
)
df = df[df['ground_truth'].notna()]

# Map Snowflake labels to 3-class
def map_label(label):
    label = str(label).lower()
    if 'positive' in label: return 'positive'
    elif 'negative' in label: return 'negative'
    return 'neutral'

df['snowflake_pred'] = df['SENTIMENT_LABEL'].apply(map_label)

# Run ONES-RS
engine = OnesEngine()
engine.load_lexicon('financial_lexicon.json')
sentences = [s.rsplit('@', 1)[0] for s in df['FULL_SENTENCE']]
results = engine.analyze_batch_auto(sentences)
df['ones_pred'] = [r[2] for r in results]

# Compare accuracy and macro F1 for both tools
for name, col in [("Snowflake", 'snowflake_pred'), ("ONES-RS", 'ones_pred')]:
    print(f"{name} Accuracy:", accuracy_score(df['ground_truth'], df[col]))
    print(f"{name} Macro F1:", f1_score(df['ground_truth'], df[col], average='macro'))
print("\nONES-RS Classification Report:")
print(classification_report(df['ground_truth'], df['ones_pred']))
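
The label extraction in the script relies on `rsplit('@', 1)`, which splits on the last `@` only, so a sentence that itself contains `@` is still parsed correctly. Below is a small sketch of that step in isolation. `split_labeled` is a hypothetical helper name, and the second and third input lines are made up for illustration.

```python
from typing import Optional, Tuple


def split_labeled(line: str) -> Tuple[str, Optional[str]]:
    """Split 'sentence@label' on the last '@'; label is None if absent."""
    if "@" not in line:
        return line.strip(), None
    sentence, label = line.rsplit("@", 1)
    return sentence.strip(), label.strip()


examples = [
    "Sales in Finland decreased by 10.5% in January.@negative",
    "Contact ir@example.com for the full report.@neutral",  # made-up line
    "A line with no label",  # made-up line
]
for line in examples:
    print(split_labeled(line))
```

Rows without a label come back with `None`, which corresponds to the rows the script drops with `notna()`.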

Try ONES-RS on Your Data

Experience the performance advantage of domain-optimized sentiment analysis for your financial documents.

Dataset: Financial PhraseBank by Malo et al. | Snowflake Cortex accessed December 2024