comparison README.MD @ 0:402b58f45844 draft default tip
planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
| author | jpayne |
|---|---|
| date | Mon, 08 Dec 2025 15:03:06 +0000 |
| parents | |
| children |

# Table Ops

A collection of simple command-line table manipulation tools written in Python. The tools are designed to be efficient and easy to use for common table operations.

## Tools

### `table-union`

Merges multiple tabular data files (e.g., CSV, TSV), either by unioning rows with identical columns or by joining on shared key columns.

**Key Features:**

* **Union Mode (Default):** Combines rows from all input files, assuming they have the same columns. Duplicate rows are retained.
* **Join Mode (`--no-union` or similar):** Joins the input files on automatically detected shared key columns, merging rows whose key values match.
* **Automatic Key Detection:** Identifies candidate join columns by looking for columns with unique, non-null values across all input files.
* **Handles various delimiters:** Supports tab-separated (TSV) and comma-separated (CSV) files.
* **Memory Efficient:** Where possible, processes large files without loading them entirely into memory.

**Usage Examples:**

```bash
table-union file1.tsv file2.tsv file3.tsv > output.tsv
```

```bash
table-summarize data.tsv
```

```bash
table-sort -k Age -k Name data.tsv > sorted_data.tsv
```

**Run Unit Tests:**

```bash
python -m unittest test_table_ops.py
```
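
**Sketch: Union Mode**

The union mode described for `table-union` (combine rows from files with the same columns, keep duplicates, avoid loading whole files) can be illustrated with a short Python sketch. This is not the tool's actual code: the function name `union_tables` is hypothetical, and the strict header check (same names in the same order) is an assumption about what "the same columns" means.

```python
import csv
import io


def union_tables(sources, out, delimiter="\t"):
    """Stream rows from each source to `out`, writing the header once.

    Hypothetical helper for illustration. Assumes every non-empty input
    has exactly the same header as the first one.
    """
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    header = None
    for source in sources:
        reader = csv.reader(source, delimiter=delimiter)
        current = next(reader, None)
        if current is None:
            continue  # skip empty inputs
        if header is None:
            header = current
            writer.writerow(header)
        elif current != header:
            raise ValueError(f"header mismatch: {current} != {header}")
        # Rows are copied one at a time, so memory use stays flat.
        # Duplicate rows are written as-is, not filtered.
        for row in reader:
            writer.writerow(row)


# Two small in-memory "files" stand in for file1.tsv and file2.tsv.
a = io.StringIO("id\tname\n1\tAda\n")
b = io.StringIO("id\tname\n1\tAda\n2\tGrace\n")
out = io.StringIO()
union_tables([a, b], out)
result = out.getvalue()
```

Because each row is written as soon as it is read, only one row per input is held in memory at a time, which is one plausible way to get the memory behavior the feature list mentions.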

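
**Sketch: Automatic Key Detection and Join**

The key detection described for `table-union` (shared columns with unique, non-null values) can be sketched as below. The helper names `read_table`, `detect_key_columns`, and `join_on_key` are hypothetical, and two details are assumptions rather than documented behavior: uniqueness is checked within each file separately, and the join keeps every key seen in any file (an outer join).

```python
import csv
import io


def read_table(text, delimiter="\t"):
    """Parse delimited text into (header, rows); rows are dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    rows = list(reader)
    return reader.fieldnames or [], rows


def is_unique_non_null(rows, column):
    """True if every value in `column` is present and no value repeats."""
    values = [row[column] for row in rows]
    if any(v is None or v == "" for v in values):
        return False
    return len(set(values)) == len(values)


def detect_key_columns(tables):
    """Return columns shared by all tables that qualify as keys in each."""
    if not tables:
        return []
    first_header = tables[0][0]
    return [
        column
        for column in first_header
        if all(
            column in header and is_unique_non_null(rows, column)
            for header, rows in tables
        )
    ]


def join_on_key(tables, key):
    """Merge rows that share a key value, keeping first-seen key order."""
    merged = {}
    for _header, rows in tables:
        for row in rows:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


samples = "id\tname\n1\tAda\n2\tGrace\n"
scores = "id\tscore\n1\t90\n2\t85\n"
tables = [read_table(samples), read_table(scores)]
keys = detect_key_columns(tables)
joined = join_on_key(tables, keys[0])
```

Here `id` is the only column present in both inputs with unique, non-empty values, so it is selected as the key and each joined row carries `id`, `name`, and `score`. Unlike the streaming union, this sketch reads each table fully into memory before joining.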