run.py is the front door for the stock rating experiment.
You give it a ticker, like AAPL. It fetches market and company data, asks a few different kinds of questions about the stock, then prints a report with a final signal such as BUY, HOLD, or SELL.
The idea is simple: do not ask an LLM to guess whether a stock is good. First collect evidence. Then let normal Python code handle the numbers, and use the LLM only where reading and judgment are useful.
python run.py AAPL
The program follows this path:
1. run.py reads the ticker and selected pillars from the command line.
2. It loads .env so it knows which LLM provider to use.
3. fundamental.py fetches stock data from yfinance.
4. scoring.py combines the scores into one weighted result.
5. scoring.py prints a readable console report.

The evaluator has three analysis pillars.
| Pillar | File | What it answers |
|---|---|---|
| 1. Fundamental | fundamental.py | Is this a financially strong business at a reasonable price? |
| 2. Technical | technical.py | Is the market currently moving toward or away from the stock? |
| 3. Qualitative | qualitative.py | What do the company story, management tone, and filing language suggest? |
You can run all pillars:
python run.py AAPL
Or only specific pillars:
python run.py AAPL --pillars 1,2
python run.py AAPL --pillars 3
--pillars accepts a comma-separated list using:
| Number | Meaning |
|---|---|
| 1 | Fundamental analysis |
| 2 | Technical analysis |
| 3 | Qualitative LLM analysis |
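The command-line handling described above can be sketched with argparse. This is only an illustration, not the code in run.py: the names `parse_pillars`, `build_parser`, and `VALID_PILLARS` are hypothetical, and defaulting to all three pillars is inferred from the "run all pillars" example.

```python
import argparse

# Pillar numbers taken from the table above.
VALID_PILLARS = {1, 2, 3}  # 1 = fundamental, 2 = technical, 3 = qualitative


def parse_pillars(value: str) -> list[int]:
    """Turn a string such as "1,2" into a sorted list of unique pillar numbers."""
    try:
        pillars = sorted({int(part) for part in value.split(",")})
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"expected comma-separated numbers, got {value!r}"
        ) from None
    unknown = [p for p in pillars if p not in VALID_PILLARS]
    if unknown:
        raise argparse.ArgumentTypeError(f"unknown pillar(s): {unknown}")
    return pillars


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser: one positional ticker plus an optional --pillars list."""
    parser = argparse.ArgumentParser(description="Rate a stock from several angles.")
    parser.add_argument("ticker", help="stock symbol, for example AAPL")
    parser.add_argument(
        "--pillars",
        type=parse_pillars,
        default=[1, 2, 3],  # assumption: no flag means all pillars
        help="comma-separated pillar numbers, for example 1,2",
    )
    return parser


# Equivalent to: python run.py AAPL --pillars 1,2
args = build_parser().parse_args(["AAPL", "--pillars", "1,2"])
print(args.ticker, args.pillars)  # AAPL [1, 2]
```

Validating the list inside the `type=` callable means a typo such as `--pillars 4` fails with a normal usage error before any data is fetched.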
run.py

This is the orchestrator. It does not contain most of the scoring logic itself.
Its main job is to:
.env

The main function is:
evaluate(ticker: str, provider: str, pillars: list[int] = None)
fundamental.py

This file handles the numbers behind the business.
It uses yfinance to fetch:
Then score_fundamental() scores:
Each dimension is scored from -2 to +2.
technical.py

This file scores momentum from price data.
It checks:
The output is one score:
-2 to +2

qualitative.py

This is where the LLM is used.
It supports:
The qualitative pillar asks the LLM to score:
It also runs two deeper checks:
yfinance

The LLM is expected to return JSON. _extract_json() tries to recover valid JSON even if the model wraps the answer in extra text.
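One way to recover JSON from a chatty reply is to try decoding from each opening brace until something parses. The sketch below is an assumption about the general approach, not the actual body of _extract_json(); the function name `extract_json` and the sample reply are made up for illustration.

```python
import json


def extract_json(text: str) -> dict:
    """Return the first JSON object found in text, ignoring surrounding prose."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # tolerates trailing text after it.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pass  # this brace did not open valid JSON, try the next one
        else:
            if isinstance(obj, dict):
                return obj
        start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model reply")


# Made-up model reply with text before and after the JSON payload.
reply = 'Here is my answer: {"score": 1, "red_flags": []} Hope that helps.'
print(extract_json(reply))  # {'score': 1, 'red_flags': []}
```

Raising an error when nothing parses lets the caller decide whether to retry the request or skip the qualitative pillar.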
scoring.py

This file turns all the evidence into the final signal.
It contains:
The default weights are:
| Dimension | Weight |
|---|---|
| Quality | 30% |
| Growth | 25% |
| Valuation | 20% |
| Momentum | 15% |
| Risk | 10% |
One important detail: the LLM qualitative score is blended into the quality dimension. If both fundamental quality and qualitative quality exist, they are averaged together.
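The weighting and the quality blend can be sketched as follows. The weights come from the table above; everything else is an assumption for illustration, including the helper names `blend_quality` and `weighted_total`, the sample scores, and the choice to skip missing dimensions and renormalise the remaining weights. The real combine_scores() may handle missing dimensions differently.

```python
from typing import Optional

# Default weights from the table above.
WEIGHTS = {
    "quality": 0.30,
    "growth": 0.25,
    "valuation": 0.20,
    "momentum": 0.15,
    "risk": 0.10,
}


def blend_quality(fundamental: Optional[float],
                  qualitative: Optional[float]) -> Optional[float]:
    """Average the two quality scores when both exist, else use whichever exists."""
    present = [s for s in (fundamental, qualitative) if s is not None]
    return sum(present) / len(present) if present else None


def weighted_total(scores: dict) -> float:
    """Weighted average of the available dimension scores.

    Assumption: dimensions with no score are skipped and the remaining
    weights are renormalised so they still sum to 1.
    """
    used = {name: s for name, s in scores.items()
            if s is not None and name in WEIGHTS}
    if not used:
        raise ValueError("no dimension scores available")
    weight_sum = sum(WEIGHTS[name] for name in used)
    return sum(WEIGHTS[name] * s for name, s in used.items()) / weight_sum


# Made-up scores on the -2 to +2 scale.
scores = {
    "quality": blend_quality(1.0, 2.0),  # (1.0 + 2.0) / 2 = 1.5
    "growth": 1.0,
    "valuation": 0.0,
    "momentum": -1.0,
    "risk": 0.0,
}
print(round(weighted_total(scores), 2))  # 0.55
```

With these sample numbers the total is 0.30 × 1.5 + 0.25 × 1 + 0.15 × (-1) = 0.55.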
After the dimensions are scored, combine_scores() calculates a weighted total.
| Weighted total | Signal |
|---|---|
| >= 1.0 | BUY |
| >= 0.5 | ACCUMULATE (weak buy) |
| >= -0.5 | HOLD |
| >= -1.0 | REDUCE (weak sell) |
| < -1.0 | SELL |
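Read top to bottom, the threshold table is a chain of comparisons. A minimal sketch, assuming a hypothetical helper named `signal_for` (the real mapping lives somewhere in scoring.py and may be written differently):

```python
def signal_for(total: float) -> str:
    """Map a weighted total to a signal using the thresholds in the table above."""
    if total >= 1.0:
        return "BUY"
    if total >= 0.5:
        return "ACCUMULATE"  # weak buy
    if total >= -0.5:
        return "HOLD"
    if total >= -1.0:
        return "REDUCE"  # weak sell
    return "SELL"


for total in (1.2, 0.55, 0.0, -0.8, -1.5):
    print(total, signal_for(total))
```

Because each check uses `>=`, a total that lands exactly on a boundary takes the stronger signal: 0.5 is ACCUMULATE and -1.0 is REDUCE.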
Get the code:
git clone https://github.com/pg1/paper-profit.git
cd paper-profit
Install the Python packages:
pip install yfinance anthropic openai python-dotenv requests
Create or edit .env in this directory:
LLM_PROVIDER=anthropic
ANTHROPIC_API_KEY=your_key_here
OPENAI_API_KEY=your_key_here
DEEPSEEK_API_KEY=your_key_here
Only the key for the selected provider is required.
Supported LLM_PROVIDER values:
| Value | Provider | Model used |
|---|---|---|
| anthropic | Anthropic | claude-opus-4-5 |
| openai | OpenAI | gpt-4o |
| deepseek | DeepSeek | deepseek-chat |
If LLM_PROVIDER is missing, run.py defaults to anthropic.
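The provider lookup can be sketched with the standard library alone. The model names, variable names, and the anthropic default come from this section; the function `resolve_provider`, the two dictionaries, and the error handling are assumptions for illustration. In the real script the environment would presumably be filled from .env first, for example with python-dotenv's `load_dotenv()`.

```python
import os
from typing import Mapping

# Provider -> model, as listed in the table above.
MODELS = {
    "anthropic": "claude-opus-4-5",
    "openai": "gpt-4o",
    "deepseek": "deepseek-chat",
}

# Provider -> environment variable holding its API key.
KEY_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
}


def resolve_provider(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Return (provider, model), defaulting to anthropic when LLM_PROVIDER is unset."""
    provider = env.get("LLM_PROVIDER", "anthropic").strip().lower()
    if provider not in MODELS:
        raise ValueError(f"unsupported LLM_PROVIDER: {provider!r}")
    # Only the key for the selected provider is required.
    if not env.get(KEY_VARS[provider]):
        raise RuntimeError(f"{KEY_VARS[provider]} is not set")
    return provider, MODELS[provider]


# Passing a plain dict keeps the example independent of the real environment.
print(resolve_provider({"ANTHROPIC_API_KEY": "your_key_here"}))
```

Failing early on a missing key gives a clearer message than an authentication error from the provider's API later in the run.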
The evaluator can show red flags even when the total score looks okay.
Automatic red flags include:
The LLM can also add red flags from:
run.py deduplicates these before printing the report.
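Deduplication that keeps the first occurrence and preserves order might look like this. It is a sketch under assumptions: the name `dedupe_flags`, the case- and whitespace-insensitive comparison, and the sample flags are all invented here, and run.py may compare flags differently.

```python
def dedupe_flags(*flag_lists: list[str]) -> list[str]:
    """Merge several lists of red flags, dropping repeats but keeping order."""
    seen: set[str] = set()
    unique: list[str] = []
    for flags in flag_lists:
        for flag in flags:
            # Assumption: flags that differ only in case or spacing are the same flag.
            key = " ".join(flag.lower().split())
            if key and key not in seen:
                seen.add(key)
                unique.append(flag.strip())
    return unique


# Made-up flags: one list from the automatic checks, one from the LLM.
automatic = ["Negative free cash flow", "High debt"]
from_llm = ["high debt", "Customer concentration"]
print(dedupe_flags(automatic, from_llm))
# ['Negative free cash flow', 'High debt', 'Customer concentration']
```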
The code tries to keep the report running when optional data is missing.
Examples:
Think of the script as a small analyst team:
- fundamental.py is the accountant.
- technical.py is the chart watcher.
- qualitative.py is the filing reader.
- scoring.py is the editor who turns everyone else’s notes into one clear report.
- run.py is the person at the desk making sure each specialist gets called in the right order.

The goal is not to predict the future perfectly. The goal is to create a repeatable process that looks at a stock from several angles before making a judgment.