annotate README.MD @ 0:402b58f45844 draft default tip
planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
| author | jpayne |
|---|---|
| date | Mon, 08 Dec 2025 15:03:06 +0000 |

# Table Ops

A collection of simple command-line table manipulation tools written in Python. These tools are designed to be efficient and easy to use for common table operations.

## Tools

### `table-union`

Merges multiple tabular data files (e.g., CSV, TSV) either by unioning rows with identical columns or by performing a join based on shared key columns.

**Key Features:**

* **Union Mode (Default):** Combines rows from all input files, assuming they have the same columns. Duplicate rows are retained.
* **Join Mode (`--no-union` or similar):** Joins the input files on automatically detected shared key columns, merging rows whose key values match.
* **Automatic Key Detection:** Identifies candidate join columns by looking for columns whose values are unique and non-null in every input file.
* **Handles Various Delimiters:** Supports tab-separated (TSV) and comma-separated (CSV) files.
* **Memory Efficient:** Handles large files without loading them entirely into memory, where possible.
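
The key detection and join described in the list above could look roughly like the following sketch. It is not taken from the tool's source: the function names `candidate_keys`, `shared_keys`, `join_on`, and `read_table` are hypothetical, and the sketch loads each table fully for clarity rather than streaming it.

```python
import csv
import io


def candidate_keys(rows):
    """Return the column names whose values are non-empty and unique in `rows`."""
    rows = list(rows)
    if not rows:
        return set()
    keys = set()
    for column in rows[0]:
        values = [row[column] for row in rows]
        # A usable key column has no missing values and no repeats.
        if all(values) and len(set(values)) == len(values):
            keys.add(column)
    return keys


def shared_keys(tables):
    """Intersect the candidate key columns of every table."""
    per_table = [candidate_keys(table) for table in tables]
    return set.intersection(*per_table) if per_table else set()


def join_on(tables, key):
    """Merge rows from all tables that share the same value in `key`."""
    merged = {}
    for table in tables:
        for row in table:
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


def read_table(text, delimiter="\t"):
    """Parse delimited text into a list of dict rows (assumed helper)."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# Made-up sample data: `age` repeats, so only `id` qualifies in both tables.
samples = read_table("id\tname\n1\tAda\n2\tBob\n")
ages = read_table("id\tage\n1\t36\n2\t36\n")
keys = shared_keys([samples, ages])
joined = join_on([samples, ages], "id")
```

In this sample, `name` is unique only in the first table and `age` repeats in the second, so `id` is the only column that passes the uniqueness and non-null check in every input.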

**Usage Example:**

```bash
table-union file1.tsv file2.tsv file3.tsv > output.tsv
```
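
As a rough illustration of what union mode does with a command like the one above, the sketch below streams rows from each file to the output one at a time. The function name `union_tables` is hypothetical, and the sketch assumes every file has a header row with the same columns in the same order; the real tool may behave differently.

```python
import csv
import sys


def union_tables(paths, out=sys.stdout, delimiter="\t"):
    """Write the rows of every file in `paths` to `out` under a single header."""
    writer = None
    for path in paths:
        with open(path, newline="") as handle:
            reader = csv.DictReader(handle, delimiter=delimiter)
            if writer is None:
                # The first file's header defines the output columns.
                writer = csv.DictWriter(
                    out,
                    fieldnames=reader.fieldnames,
                    delimiter=delimiter,
                    lineterminator="\n",
                )
                writer.writeheader()
            elif reader.fieldnames != writer.fieldnames:
                raise ValueError(f"{path}: columns differ from the first file")
            # Rows are copied one at a time, so no file is held in memory.
            # Duplicate rows are written as-is, not removed.
            for row in reader:
                writer.writerow(row)
```

Calling `union_tables(["file1.tsv", "file2.tsv", "file3.tsv"])` would print the combined table to standard output, which can then be redirected to a file.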

```bash
table-summarize data.tsv
```

```bash
table-sort -k Age -k Name data.tsv > sorted_data.tsv
```
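
A multi-key sort like the one in the command above can be sketched as follows. This assumes that repeated `-k` options name columns in priority order and that numeric-looking values are compared as numbers; neither is confirmed by this README, and the names `sort_key` and `sort_rows` are hypothetical.

```python
import csv
import io


def sort_key(value):
    """Order numbers before text so mixed columns still compare cleanly."""
    try:
        return (0, float(value), "")
    except ValueError:
        return (1, 0.0, value)


def sort_rows(rows, columns):
    """Sort dict rows by `columns`, with earlier columns taking priority."""
    return sorted(rows, key=lambda row: [sort_key(row[c]) for c in columns])


# Made-up sample data: two rows tie on Age and are ordered by Name.
text = "Name\tAge\nCara\t41\nBob\t9\nAda\t41\n"
rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))
ordered = [row["Name"] for row in sort_rows(rows, ["Age", "Name"])]
```

Here `ordered` lists Bob first because 9 is compared as a number rather than as text, and the tie at age 41 is broken by the second column.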

**Run Unit Tests:**

```bash
python -m unittest test_table_ops.py
```
