#Part 2

Exploring Finance APIs

This second part in the series covers several well-known finance APIs that you can use in your Python code to obtain and analyse stock data

Introduction

This second part in the series covers several well-known finance APIs that you can use in your Python code to obtain and analyse stock data. In particular, we'll focus on those APIs that are available for free or mostly for free, meaning your requests to the key services and most of the data provided by an API can be done for free.

This part assumes that you've already installed a Python environment on your local machine and have familiarised yourself with Google Colab, as discussed in part 1 of this series. I encourage you to use Google Colab to try the examples in this article.
Executive Summary
In this chapter, we query the Yahoo Finance library to obtain a stock's daily trading info and some financial indicators of a company, use Pandas Datareader to get the S&P500's daily prices, use Quandl to get the price of gold, and use get-all-tickers (or direct links to CSV files) to get the list of traded stocks.

APIs to Obtain and Analyse Stock Data

You have several options when it comes to getting stock data programmatically. The most natural way is via an API. The data can be received from an API either through a direct HTTP request (for example, with the requests library) or through a Python wrapper library for the API. In this article, you'll look at the following APIs:
  • Yahoo Finance API
  • Quandl API
  • Pandas-datareader
  • get-all-tickers library (a Python wrapper to an API provided by the NASDAQ marketplace)
In this part, we won't go too deep into financial analysis tasks you can perform with these APIs but rather will look at their general capabilities. However, you'll get a chance to get your hands dirty with the APIs.

General rules for API selection

There are many parameters you should take into consideration before choosing an API:
  • Functionality
    It is quite rare for a single data source to have enough data to cover all your needs. For example, the Yahoo Finance API can provide only daily data, not hourly or minute-level data.
  • Free / Paid
    Many data sources have a free tier that lets you try their data (with a limited number of calls per day), while paid options can offer more granularity and no usage limits.
  • Stability
    Check when the last release was and how often the data source is updated. Many Python libraries have pages on pypi.org, where you can find stats on the number of installs (more is better), GitHub stars and forks (more is better), and the current health status. For small projects, you should assume that the data source can be unreachable at times and can return a null value or an error at any moment.
  • Documentation
    It is very handy to have all the API call details covered in one place. For example, Quandl has a separate web page for many of its time series, with a detailed description and code snippets for different programming languages, such as this one: https://www.quandl.com/data/LBMA/GOLD-Gold-Price-London-Fixing
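The stability point above suggests wrapping every data request defensively. Here is a minimal sketch of that idea; it is not part of any of the libraries discussed in this article. The names fetch_with_retry and flaky_source are hypothetical, and the stand-in function simply fails twice before returning a value, so the example runs without network access.

```python
import time

def fetch_with_retry(fetch, retries=3, delay_seconds=0.0, default=None):
    """Call fetch() up to `retries` times; return `default` if every attempt fails or returns None."""
    for attempt in range(1, retries + 1):
        try:
            result = fetch()
            if result is not None:
                return result
        except Exception as error:  # broad on purpose: data sources fail in many ways
            print(f"Attempt {attempt} failed: {error}")
        time.sleep(delay_seconds)
    return default

# Hypothetical stand-in for a real API call: fails twice, then succeeds
calls = {"count": 0}

def flaky_source():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("data source unreachable")
    return {"ticker": "PFE", "close": 33.66}

data = fetch_with_retry(flaky_source, retries=3)
print(data)
```

In a real notebook you would pass a function that performs the actual request and choose a non-zero delay_seconds, so that repeated attempts do not hit the data source too quickly.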


Yahoo Finance API

I will summarise its main principles here:
As mentioned in the previous section, it is very useful to look at the library's page on PyPI (https://pypi.org/project/yfinance/). As of mid-July 2020, it shows that the project is quite popular: 165k installs per month, 2,300 stars on GitHub, 6,700 followers on Twitter, and a latest version released only about half a year earlier, in December 2019.

The Yahoo Finance API allows you to make 2,000 requests per IP per hour, which adds up to 48k requests per day. Before starting to obtain stock data programmatically with the Yahoo Finance API, let's look at the Yahoo Finance website in a browser to see what can be found there. Suppose you want to look at historical stock prices for Pfizer Inc. (PFE). To do this, point your browser to https://finance.yahoo.com/quote/PFE/history?p=PFE

The key fragment of what you'll see in your browser is shown in the following screenshot:
Figure-1: Yahoo Finance data for Pfizer Inc. (PFE)
Let's now try to get some stock data programmatically from Yahoo Finance. Go to Google Colab, as described in part 1, and open a new notebook to be used for the examples in this article.

To start with, install the yfinance library in your notebook:
Py1. Installing yfinance

!pip install yfinance
Now, suppose you first want to look at some general — including financial — information about the company of interest. This can be done as follows (you should use a new code cell in your Colab notebook for this code):
Py2. Checking yfinance

import yfinance as yf

pfe = yf.Ticker('PFE')
pfe.info
The output is truncated to save space:
Py3. Info section for one ticker from yfinance

{
'zip': '10017',
'sector': 'Healthcare',
'fullTimeEmployees': 88300,
'longBusinessSummary': 'Pfizer Inc. develops, manufactures, and sells healthcare products worldwide. It offers …',
'city': 'New York',
'phone': '212-733-2323',
'state': 'NY',
'country': 'United States',
…
'profitMargins': 0.31169,
'enterpriseToEbitda': 11.87,
'52WeekChange': -0.15343297,
…
}
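Because a data source can omit a field at any moment, it can help to read values from the info dictionary with a fallback. The following sketch works on a plain dictionary copied from the truncated output above, so it runs without calling yfinance. The helper name get_metrics and the key 'someMissingField' are hypothetical and exist only for illustration.

```python
# Sample dictionary copied from the truncated output above; a real pfe.info has many more keys
info = {
    'sector': 'Healthcare',
    'fullTimeEmployees': 88300,
    'profitMargins': 0.31169,
    'enterpriseToEbitda': 11.87,
    '52WeekChange': -0.15343297,
}

def get_metrics(info, keys):
    """Return the requested keys, using None for any that the source did not provide."""
    return {key: info.get(key) for key in keys}

# 'someMissingField' is a made-up key, deliberately absent, to show the fallback
metrics = get_metrics(info, ['profitMargins', '52WeekChange', 'someMissingField'])
print(metrics)
```

Using dict.get instead of square brackets means a missing field shows up as None rather than stopping the notebook with a KeyError.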
For details on dividends and stock splits, you can take advantage of the actions property of the Ticker object:
Py4. Getting actions (dividends and stock splits)

pfe.actions
In this particular example, this should produce the following output:
Py5. Output with actions for the ticker PFE (Pfizer Inc.)

            Dividends  Stock Splits
Date
1972-08-29    0.00333           0.0
1972-11-28    0.00438           0.0
1973-02-28    0.00333           0.0
1973-05-30    0.00333           0.0
1973-08-28    0.00333           0.0
…                   …             …
2019-05-09    0.36000           0.0
2019-08-01    0.36000           0.0
2019-11-07    0.36000           0.0
2020-01-30    0.38000           0.0
2020-05-07    0.38000           0.0
The dividend started at just $0.00333 in cash per share in 1972 and reached $0.38 per share in 2020.

Dividends can affect a stock's price in many ways and change the growth patterns observed before. For example, a company may increase its dividend rate at some point to show that it is ready to give back more of its earnings to shareholders and increase their investment income. This can move the stock price upwards if the market believes that the company's management is eager to continue paying high dividends.
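To see how the dividend rate changes over time, you can sum the payments per calendar year. The sketch below rebuilds only the last five rows of the actions output shown above as a small dataframe, so it runs without a network call. With the real pfe.actions dataframe, the same groupby line should apply, assuming it has the same 'Dividends' column and date index.

```python
import pandas as pd

# The last five rows of the actions output shown above (earlier rows omitted)
actions = pd.DataFrame(
    {
        'Dividends': [0.36, 0.36, 0.36, 0.38, 0.38],
        'Stock Splits': [0.0, 0.0, 0.0, 0.0, 0.0],
    },
    index=pd.to_datetime(
        ['2019-05-09', '2019-08-01', '2019-11-07', '2020-01-30', '2020-05-07']
    ),
)
actions.index.name = 'Date'

# Sum the dividends paid within each calendar year
dividends_per_year = actions['Dividends'].groupby(actions.index.year).sum()
print(dividends_per_year)
```

Note that the 2019 total here covers only the three 2019 rows visible in the truncated output, and 2020 covers only the payments made up to the time of the snapshot.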

Now suppose you want to obtain historical stock prices for Pfizer Inc. over the past six months. This can be done as follows:
Py6. Six month history for the stock prices

hist = pfe.history(period="6mo")
Depending on your needs, you can specify another period. Your options include: 1d, 5d, 1mo, 3mo, 6mo, 1y, 2y, 5y, 10y, ytd, max. Apparently, the hist variable shown in the previous code snippet contains the stock data we have requested. If so, in what format? This can be instantly clarified as follows:
Py7. The format of a data received from yfinance

type(hist)

<class 'pandas.core.frame.DataFrame'>
As you can see, yfinance returns data in the pandas dataframe format. So you can use the pandas' info() function to print a concise summary of the dataframe:
Py8. The detailed info about the dataframe from yfinance

hist.info()

<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 125 entries, 2020-01-13 to 2020-07-10
Data columns (total 7 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   Open          125 non-null    float64
 1   High          125 non-null    float64
 2   Low           125 non-null    float64
 3   Close         125 non-null    float64
 4   Volume        125 non-null    int64
 5   Dividends     125 non-null    float64
 6   Stock Splits  125 non-null    int64
Suppose you're interested in open prices only. The necessary selection from the original dataframe can be done as follows:
Py9. Saving one slice of data to a separate dataframe

df1 = hist[['Open']]
print(df1)
Now if you print out the df1 dataframe variable, you'll see the following output:
Py10. Open date daily stats

             Open
Date
2020-01-13  38.83
2020-01-14  38.65
2020-01-15  39.39
2020-01-16  39.98
2020-01-17  39.76
…               …
2020-07-06  34.95
2020-07-07  34.05
2020-07-08  34.01
2020-07-09  33.73
2020-07-10  33.66
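Once the prices are in a dataframe, simple statistics are one line away. As a small sketch, the code below rebuilds the five most recent rows of the output above and computes the day-over-day change and the change across the whole sample. The variable names daily_change and total_change are chosen for this illustration only.

```python
import pandas as pd

# The five most recent open prices from the output above
df1 = pd.DataFrame(
    {'Open': [34.95, 34.05, 34.01, 33.73, 33.66]},
    index=pd.to_datetime(
        ['2020-07-06', '2020-07-07', '2020-07-08', '2020-07-09', '2020-07-10']
    ),
)
df1.index.name = 'Date'

# Day-over-day relative change; the first row has no previous day, so it is NaN
daily_change = df1['Open'].pct_change()

# Relative change between the first and the last row of the sample
total_change = df1['Open'].iloc[-1] / df1['Open'].iloc[0] - 1

print(daily_change)
print(f"Change over the sample: {total_change:.2%}")
```

The same two lines should work on the full six-month dataframe returned by yfinance, since only the 'Open' column is used.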

Some Examples on Using Yahoo Finance API

Now that you know how yfinance works in general, let's look at how it might be used in some simple examples. Say, you want to look at 1-year stock price history for the following companies:
Py11. Defining the set of tickers

tickers = ['TSLA', 'API', 'LMND', 'MRK']
Some info on the above tickers and their recent performance:
  • TSLA (Tesla Inc)
    shows the most impressive growth. Although many investors were "shorting" the stock (betting on its decrease), it showed amazing growth in recent months.
  • API (Agora Inc) and LMND (Lemonade Inc)
    are companies that had their IPOs recently. Their prices are quite volatile in the first months: they can jump 10–20% in a matter of days. This gives a good opportunity to make profits quickly, but also bears more risk, as these stocks can go down quickly as well.
  • MRK (Merck & Co., Inc.)
    like many other stocks in most verticals, had a drop around 2020-03 (the Covid-19 effect) and has now recovered to the previous year's levels.
For clarity, you might want to make a plot for each company:
Py12. One year of historic prices

import matplotlib.pyplot as plt

# Draw one subplot per ticker, stacked vertically
for i, ticker in enumerate(tickers):
  current_ticker = yf.Ticker(ticker)
  plt.subplot(len(tickers), 1, i + 1)
  current_ticker.history(period='365d')['Close'].plot(figsize=(16, 60), title='1 year price history for ticker: ' + ticker)
The plot for Tesla Inc might look as illustrated in the following figure:
Figure-2: Tesla Inc. 1 year stock price
Continuing with this example, suppose you want to look at a particular financial parameter of a certain company from the list of tickers you defined here:
Py13. Getting info for first ticker

ticker = tickers[0]
yf_info = yf.Ticker(ticker).info
print(ticker)

# Output:
# TSLA
You already saw an example of using the info property of a Ticker object in the beginning of this section. If you recall, the info includes a lot of parameters related to the company, including both general and financial ones. You can extract the necessary one as follows:
Py14. 52 Weeks change in price is approximately 1 year change

# An easy way to get 1-year stock growth
yf_info['52WeekChange']

# Output:
# 4.8699937
The following example illustrates how you can compare two financial parameters: 52WeekChange and profitMargins for several tickers:
Py15. Profit margins vs. 52-weeks change

stock_52w_change = []
profitsMargins = []
tickers = ['NVS','JNJ','ABBV','AMGN']
for ticker in tickers:
  print(ticker)
  current_ticker = yf.Ticker(ticker)
  current_ticker_info = current_ticker.info
  stock_52w_change.append(current_ticker_info['52WeekChange'])
  profitsMargins.append(current_ticker_info['profitMargins'])
You'll combine the stock_52w_change and profitsMargins lists created in the above code cell into a Pandas dataframe:
Py15 (continued). Combining the lists into a dataframe

import pandas as pd
# Use a list (not a set) for the index: a set has no guaranteed order, so the row labels could end up attached to the wrong rows.
# The first row always holds stock_52w_change, because it is the first list passed to the constructor.
df = pd.DataFrame([stock_52w_change, profitsMargins], columns=tickers, index=['52w change', 'profitMargins'])

print(df)

# Output
#                      NVS          JNJ         ABBV        AMGN
# 52w change        -0.06242     0.160992    0.482794    0.469441
# profitMargins      0.24318     0.188630    0.247700    0.320250
You might also want to look at a visual representation of this comparison:
Py16. Graph visualisation for profit margins vs. 52 weeks growth

import matplotlib.ticker as mtick
ax = df.plot.bar()
ax.yaxis.set_major_formatter(mtick.PercentFormatter(xmax=1))
ax.set_title('Comparing Profit Margins and 52 weeks growth rates for pharma stocks')
This code should generate the following bar chart:
Figure-3: Profit margins vs. 52-weeks change
Interestingly, the four companies from the same pharma sector show different patterns: profit margins, one of the most important financial ratios (profitMargins = [Net Income / Net Sales] * 100%), range from 19% to 32%, while the stock prices changed by anywhere from -6% to +48% (52-week change =~ 1-year change).
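The profit margin formula quoted above is easy to check by hand. The following is a minimal worked example with made-up numbers for a hypothetical company. The function name profit_margin is chosen for illustration and is not part of yfinance.

```python
def profit_margin(net_income, net_sales):
    """Return net income as a fraction of net sales (0.25 means 25%)."""
    if net_sales == 0:
        raise ValueError("net_sales must be non-zero")
    return net_income / net_sales

# Hypothetical company: 12 of net income on 48 of net sales (same currency unit)
margin = profit_margin(net_income=12.0, net_sales=48.0)
print(f"{margin:.1%}")
```

The value is returned as a fraction, which matches how profitMargins appears in the info dictionary (for example, 0.31169 for roughly 31%).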

Getting Data for the S&P500 Index

It's common practice to compare a stock's performance and 'health' against an index, which represents the aggregated average performance of a set of stocks. The most well-known index is probably the S&P500. You can get it from Pandas Datareader, which uses data from the company Stooq as one of its sources. The list of all indexes available is on this page.

With the following code, you can create a plot for 1 year price history for index S&P500:
Py 27. S&P500 data retrieved with Pandas Data reader

import pandas_datareader.data as pdr
from datetime import date, timedelta

end = date.today()
# Go back one year (plus two days) with timedelta, which stays valid on any calendar day
start = end - timedelta(days=367)
# Stooq is the data source behind this call
spx_index = pdr.get_data_stooq('^SPX', start, end)
spx_index['Close'].plot(title='1 year price history for index S&P500')
This should generate the following plot:
Figure-4: S&P 500 index history, July-2019 to July-2020
As you can see from the plot, the S&P500 was around 3000 one year ago, dropped to 2200 in March, and has since recovered to about 3200 (roughly an 8% increase in 1 year). Over the last 3 months it has been a 'bullish' market, with constantly growing prices of the stock index and of many individual stocks.
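The two numbers read off the plot above, the change over the year and the depth of the March drop, can also be computed. The sketch below uses only the three rounded index levels mentioned in the text, with placeholder dates, so it is an illustration and not real market data. On the real spx_index['Close'] series the same expressions should apply once the series is sorted by date in ascending order.

```python
import pandas as pd

# Three rounded index levels taken from the description above; the dates are placeholders
levels = pd.Series(
    [3000.0, 2200.0, 3200.0],
    index=pd.to_datetime(['2019-07-15', '2020-03-23', '2020-07-15']),
)

# Return over the whole period: last level relative to the first one
period_return = levels.iloc[-1] / levels.iloc[0] - 1

# Drawdown: how far each level sits below the highest level seen so far
drawdown = levels / levels.cummax() - 1
max_drawdown = drawdown.min()

print(f"Period return: {period_return:.1%}")
print(f"Maximum drawdown: {max_drawdown:.1%}")
```

With these rounded levels the period return comes out at about 6.7%, in the same range as the figure quoted above.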

Quandl API

Quandl brings together millions of financial and economic datasets from hundreds of sources, providing access to them via a single free API. This diversity of sources lets you look at other asset classes, such as gold or bitcoin, and compare them with the performance of stocks. To make this API available from your Colab notebook, install the Python wrapper for it with the following command:
Py 28. Installing Quandl

!pip install quandl
After successful installation, you can start using it as illustrated in the following example. Here, you request the prices for gold in London that are globally considered as the international standard for pricing of gold.
Py 29. Getting Gold prices from Quandl

import quandl

# Replace the placeholder string with your own token
london_fixing_gold_price = quandl.get("LBMA/GOLD", start_date=start, end_date=end, authtoken="<your auth token>")
(You will need to create an account, describe the purpose of using Quandl data, and get a FREE token).

The Gold price in London is set twice a day, so you might want to look at your options before making a plot:
Py 30. List of columns returned by Quandl

print(london_fixing_gold_price.columns)

# Output:
# Index(['USD (AM)', 'USD (PM)', 'GBP (AM)', 'GBP (PM)', 'EURO (AM)', 'EURO (PM)'], dtype='object')
Suppose you want to look at the morning prices in USD:
Py 31. Gold Morning prices

london_fixing_gold_price['USD (AM)'].plot(figsize=(20,5), title="Gold Price: London Fixing in USD(AM price)")
plt.show()
The generated plot might look as follows:
Figure-5: One year of a daily price of gold
Py 32. One year of return for gold

# A harder way to get 1-year growth, controlling the exact dates
london_fixing_gold_price['USD (AM)']['2020-07-17'] / london_fixing_gold_price['USD (AM)']['2019-07-17']

# Output:
# 1.2870502569960023
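Looking up exact dates, as in the snippet above, is likely to fail with a KeyError when one of the dates is missing from the index, for example because it falls on a weekend. One possible workaround, sketched here on synthetic prices, is the pandas Series.asof method, which returns the most recent value at or before a given date. The prices below are made up, and the helper name price_on_or_before is hypothetical.

```python
import pandas as pd

# Synthetic prices on business days only; the values are made up for illustration
prices = pd.Series(
    [1400.0, 1410.0, 1800.0, 1810.0],
    index=pd.to_datetime(['2019-07-18', '2019-07-19', '2020-07-16', '2020-07-17']),
)

def price_on_or_before(series, day):
    """Return the most recent available price at or before `day` (index must be sorted)."""
    return series.asof(pd.Timestamp(day))

# 2019-07-20 is a Saturday, so the lookup falls back to Friday 2019-07-19
start_price = price_on_or_before(prices, '2019-07-20')
end_price = price_on_or_before(prices, '2020-07-17')

growth = end_price / start_price
print(growth)
```

Series.asof expects the index to be sorted in ascending order, so sort the series first if your data source returns the newest rows on top.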

List of Tickers

Sometimes you may simply need to get a complete list of ticker symbols from a certain marketplace or several marketplaces. This is where the get-all-tickers library may come in very handy. In this section, we'll touch upon this library, which is actually a wrapper to an API provided by the NASDAQ marketplace. This library allows you to retrieve all the tickers from the three most popular stock exchanges: NYSE, NASDAQ, and AMEX.

The library is open source, and you can find it in this GitHub repository. If you explore the code, you'll discover that it extracts the tickers from several CSV files:
Py 33. Tickers from three popular exchanges stored in CSV files

_NYSE_URL = 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=nyse&render=download'

_NASDAQ_URL = 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=nasdaq&render=download'

_AMEX_URL = 'https://old.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=amex&render=download'
So, using the above links, you can obtain a list of tickers directly as a file (and then import it into Python) without using the library, which is especially useful when a request to the library hangs (this happens sometimes).
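Importing such a file into Python takes a couple of lines with pandas. The sketch below reads a small in-memory CSV instead of a downloaded file, so it runs offline. The column names ('Symbol', 'Name', 'Sector') and the sector values are assumptions made for this illustration, so check the header of the file you actually download.

```python
import io
import pandas as pd

# Stand-in for a downloaded file; the column names and sector values are assumptions
csv_text = """Symbol,Name,Sector
PFE,Pfizer Inc.,Health Care
MRK,"Merck & Co., Inc.",Health Care
TSLA,Tesla Inc,Capital Goods
"""

# pd.read_csv also accepts a local file path or a URL in place of the buffer
companies = pd.read_csv(io.StringIO(csv_text))
tickers_from_csv = companies['Symbol'].str.strip().tolist()
print(tickers_from_csv)
```

Stripping whitespace from the symbols is a cheap precaution, since exported files sometimes pad their values.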

Before you can start using the get-all-tickers library, you'll need to install it. This can be done with the pip command as follows in a code cell in your Colab notebook:
Py 34. Install the tickers library Python wrapper

!pip install get-all-tickers
After the successful installation, issue the following line of code in a new code cell to get all tickers from NYSE and NASDAQ stock exchanges:
Py 35. Save all tickers to a list

from get_all_tickers import get_tickers as gt

list_of_tickers = gt.get_tickers(NYSE=True, NASDAQ=True, AMEX=False)
The first thing you might want to do is to check the number of returned tickers:
Py 36. Length of the list with tickers

len(list_of_tickers)

# Output:
# 6144
Then you might want to look at the tickers:
Py 37. Printing the tickers

print(list_of_tickers)
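Before feeding a long list of tickers into other API calls, you may want to tidy it up. The following sketch runs on a small made-up sample, not on real output of the library. The helper name clean_tickers is hypothetical, and dropping every symbol that contains non-alphabetic characters is a simplifying assumption that may not suit every use case.

```python
# Small made-up sample standing in for the list returned above
list_of_tickers = ['PFE', 'MRK ', 'TSLA', 'PFE', 'BRK.B', 'ABC^D']

def clean_tickers(tickers):
    """Strip whitespace, drop symbols with non-alphabetic characters, de-duplicate and sort."""
    stripped = (ticker.strip() for ticker in tickers)
    return sorted({ticker for ticker in stripped if ticker.isalpha()})

plain_tickers = clean_tickers(list_of_tickers)
print(plain_tickers)
```

If you need share classes or other symbols that contain dots or carets, replace the isalpha() filter with a rule that matches the naming scheme of your data source.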
Conclusion
By following the instructions provided in this second part of the series, you have queried the Yahoo Finance library for a stock's daily trading info and a company's financial indicators, used Pandas Datareader to get the S&P500's daily prices, used Quandl to get the price of gold, and used get-all-tickers (or direct links to CSV files) to get the list of traded stocks.
