1. pytablereader
1.1. Summary
pytablereader is a Python library to load structured table data from files, strings, or URLs in various data formats: CSV / Excel / Google Sheets / HTML / JSON / LDJSON / LTSV / Markdown / SQLite / TSV.
1.2. Features
- Extract structured tabular data from various data formats:
  - CSV / Tab separated values (TSV) / Space separated values (SSV)
  - Microsoft Excel™ file
  - HTML (table tags)
  - JSON
  - Line-delimited JSON (LDJSON) / NDJSON / JSON Lines
  - Markdown
  - MediaWiki
  - SQLite database file
- Supported data sources are:
  - Files on a local file system
  - Accessible URLs
  - str instances
- Loaded table data can be used as:
  - pandas.DataFrame instance
  - dict instance
2. Installation
2.1. Install from PyPI
pip install pytablereader
Some of the formats require additional dependency packages. You can install them as follows:
- Excel
pip install pytablereader[excel]
- Google Sheets
pip install pytablereader[gs]
- Markdown
pip install pytablereader[md]
- MediaWiki
pip install pytablereader[mediawiki]
- SQLite
pip install pytablereader[sqlite]
- Load from URLs
pip install pytablereader[url]
- All of the extra dependencies
pip install pytablereader[all]
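When more than one format is needed, pip accepts several extras in a single requirement, separated by commas. The sketch below builds such a command from the extra names listed above. `pip_install_command` and `KNOWN_EXTRAS` are hypothetical names used only for this illustration. They are not part of pytablereader or pip.

```python
# Extra names taken from the list in this section.
KNOWN_EXTRAS = ("excel", "gs", "md", "mediawiki", "sqlite", "url", "all")


def pip_install_command(extras):
    """Hypothetical helper: build one pip command covering several extras."""
    unknown = sorted(set(extras) - set(KNOWN_EXTRAS))
    if unknown:
        raise ValueError("unknown extras: " + ", ".join(unknown))
    if not extras:
        return "pip install pytablereader"
    joined = ",".join(extras)
    # Quote the requirement so shells such as zsh do not expand the brackets.
    return f'pip install "pytablereader[{joined}]"'


print(pip_install_command(["excel", "url"]))
# prints: pip install "pytablereader[excel,url]"
```

Quoting the requirement is a precaution for shells that treat square brackets as glob characters.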
2.2. Install from PPA (for Ubuntu)
sudo add-apt-repository ppa:thombashi/ppa
sudo apt update
sudo apt install python3-pytablereader
3. Dependencies
3.1. Optional Python packages
3.2. Optional packages (other than Python packages)
- libxml2 (faster HTML conversion)
- pandoc (required when loading MediaWiki files)