annotate README.MD @ 0:402b58f45844 draft default tip

planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
author jpayne
date Mon, 08 Dec 2025 15:03:06 +0000
parents
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
1 # Table Ops
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
2
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
3 A collection of simple command-line table manipulation tools written in Python. These tools are designed to be efficient and easy to use for common table operations.
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
4
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
5 ## Tools
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
6
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
7 ### `table-union`
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
8
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
9 Merges multiple tabular data files (e.g., CSV, TSV) either by unioning rows with identical columns or by performing a join based on shared key columns.
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
10
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
11 **Key Features:**
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
12
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
13 * **Union Mode (Default):** Combines rows from all input files, assuming they have the same columns. Duplicate rows are retained.
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
14 * **Join Mode (`--no-union` or similar):** Performs a join operation based on automatically detected shared key columns. It intelligently identifies potential key columns by looking for columns with unique, non-null values across all input files. This mode merges rows based on matching key values.
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
15 * **Automatic Key Detection:** Automatically identifies suitable columns for joining based on uniqueness and non-null constraints.
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
16 * **Handles various delimiters:** Supports tab-separated (TSV) and comma-separated (CSV) files.
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
17 * **Memory Efficient:** Optimized to handle large files without loading them entirely into memory (where possible).
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
18
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
19 **Usage Example:**
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
20
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
21 ```bash
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
22 table-union file1.tsv file2.tsv file3.tsv > output.tsv
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
23 ```
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
24
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
25 ```bash
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
26 table-summarize data.tsv
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
27 ```
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
28
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
29 ```bash
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
30 table-sort -k Age -k Name data.tsv > sorted_data.tsv
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
31 ```
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
32
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
33 **Run Unit Tests:**
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
34
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
35 ```bash
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
36 python -m unittest test_table_ops.py
402b58f45844 planemo upload commit 9cc4dc1db55299bf92ec6bd359161ece4592bd16-dirty
jpayne
parents:
diff changeset
37 ```