CHANGELOG
NEXT
Features
A helper function, augur.subsample.get_parallelism, has been added to optimize usage of augur subsample in Snakemake workflows. This is experimental and not yet part of the public API. #1963 (@victorlin)
Bug fixes
33.0.0 (26 January 2026)
Major Changes
Features
filter, frequencies, refine: Added support in metadata for precise date ranges in YYYY-MM-DD/YYYY-MM-DD format. #1304 (@victorlin)
refine: Added a new option --keep-ids to keep certain tips in the tree regardless of clock filtering. This allows force-inclusion similar to augur filter’s --include option, and the same file can be used for both. #1768 (@victorlin)
augur.io.write_json is a new function serving as a replacement for augur.utils.write_json. Output minification is controlled by two new parameters minify and minify_threshold_mb. #1943 (@victorlin)
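As an illustration of the YYYY-MM-DD/YYYY-MM-DD date range format, here is a minimal standalone sketch of how such a value could be split into a start and end date. The function name parse_date_range and the rule that the start must not come after the end are assumptions for this example; this is not Augur's own parsing code.

```python
from datetime import date


def parse_date_range(value: str) -> tuple[date, date]:
    """Parse 'YYYY-MM-DD/YYYY-MM-DD' into (start, end) dates.

    Hypothetical helper for illustration only.
    """
    start_text, separator, end_text = value.partition("/")
    if not separator:
        raise ValueError(f"expected START/END, got {value!r}")

    # date.fromisoformat raises ValueError on malformed or impossible dates.
    start = date.fromisoformat(start_text)
    end = date.fromisoformat(end_text)

    # Assumption: a range that runs backwards is treated as an error.
    if start > end:
        raise ValueError(f"start {start} is after end {end}")
    return start, end


start, end = parse_date_range("2025-01-15/2025-02-03")
print(start, end)
```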
Bug fixes
export v2: Improved the error message that is displayed when a deprecated coloring key is used. #1882 (@corneliusroemer)
export v2: --no-minify-json now properly overrides any truthy value in AUGUR_MINIFY_JSON. #1943 (@victorlin)
export v2: Skip unhashable node attr values with warning message to avoid previously unhandled TypeError. #1948 (@joverlee521)
32.1.0 (18 November 2025)
Features
augur.io.read_metadata: Added a new parameter keep_id_as_column to keep the resolved id column as a column in addition to setting it as the DataFrame index. #1917 (@victorlin)
subsample: Filepaths in the config file can now be relative to the config file’s parent directory in addition to the current working directory. Custom directories can also be specified using a new command line option --search-paths or environment variable AUGUR_SEARCH_PATHS. #1897 (@victorlin)
A helper function, augur.subsample.get_referenced_files, has been added to optimize usage of augur subsample in Snakemake workflows. This is experimental and not yet part of the public API. #1918 (@victorlin)
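To make the search path idea concrete, the following sketch resolves a relative filepath against a list of candidate directories. The function name resolve_config_path and, in particular, the lookup order (custom search paths, then the config file's parent directory, then the current working directory) are assumptions chosen for this example; the entry above does not state the precedence Augur uses.

```python
from pathlib import Path
from typing import Iterable, Optional, Union

PathLike = Union[str, Path]


def resolve_config_path(
    filepath: PathLike,
    config_file: PathLike,
    search_paths: Iterable[PathLike] = (),
) -> Optional[Path]:
    """Return the first existing match for filepath, or None.

    Hypothetical helper; the lookup order below is an assumption.
    """
    candidate = Path(filepath)
    if candidate.is_absolute():
        return candidate if candidate.exists() else None

    base_directories = [
        *(Path(p) for p in search_paths),  # custom directories
        Path(config_file).parent,          # directory holding the config file
        Path.cwd(),                        # current working directory
    ]
    for base in base_directories:
        resolved = base / candidate
        if resolved.exists():
            return resolved
    return None
```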
Bug fixes
32.0.0 (21 October 2025)
Major Changes
ancestral, translate: These will now error when the length of any reference gene is indivisible by 3, instead of silently padding with N to translate to ‘X’. #1895 (@victorlin)
augur.utils.load_features is deprecated and will be removed in a future major version. Users should use augur.io.load_features instead. #1912 (@victorlin)
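For the reference gene length check, a workflow could run a similar test up front before calling ancestral or translate. This is a rough sketch under stated assumptions: the function name check_gene_length and the 1-based inclusive coordinates are hypothetical and not taken from Augur.

```python
def check_gene_length(name: str, start: int, end: int) -> int:
    """Return the number of codons in a gene, or raise if it has a partial codon.

    Assumes 1-based, inclusive coordinates (an assumption for this sketch).
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(
            f"gene {name!r} has length {length}, which is not divisible by 3"
        )
    return length // 3


print(check_gene_length("geneA", 1, 9))
```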
Features
augur curate apply-record-annotations will now warn if an annotation was unnecessary, often indicative of the upstream data being updated. #1893 (@jameshadfield)
31.5.0 (17 September 2025)
A new command, augur subsample, supports complex subsampling using file-based configuration. See the updated Filtering and Subsampling guide for a comparison with augur filter. #635 (@victorlin)
31.4.0 (14 August 2025)
Features
schema: Allow parentheses (()) in gene names. #1819 (@kimandrews)
geolocation rules: Add rules to define region per country to ensure that regions are labelled for all countries. This is especially useful for data sources that do not include region in the metadata. #1844 (@joverlee521)
Support numpy v2 in addition to v1. #1855 (@corneliusroemer)
Support for Python 3.13. #1857 (@corneliusroemer)
tree: Prefer iqtree3 binary over iqtree2 and iqtree when available. #1875 (@joverlee521)
export v2: URLs encoded in metadata (both TSV and node-data JSONs) will be associated with the value in the exported JSON. Given a column/key <X>, a valid URL in a column/key named <X>__url will be automatically used. This allows values to be a clickable link when viewed in Auspice. #1852 (@jameshadfield)
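The <X>__url convention can be pictured with a small standalone sketch. The function names, the http/https check used as the definition of a "valid URL", and the output shape with "value" and "url" keys are assumptions made for this example rather than a description of how export v2 is implemented.

```python
from urllib.parse import urlparse

URL_SUFFIX = "__url"


def is_valid_url(value: str) -> bool:
    """Assumed validity rule: an http(s) URL with a host."""
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def attach_urls(record: dict) -> dict:
    """Pair each key <X> with the URL stored under <X>__url, when valid."""
    result = {}
    for key, value in record.items():
        if key.endswith(URL_SUFFIX):
            continue  # URL columns are consumed by their base column
        entry = {"value": value}
        url = record.get(key + URL_SUFFIX, "")
        if is_valid_url(url):
            entry["url"] = url
        result[key] = entry
    return result


row = {
    "accession": "ABC123",
    "accession__url": "https://example.org/ABC123",
    "country": "Kenya",
}
print(attach_urls(row))
```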
Bug fixes
filter: Improved speed of using --group-by month on large datasets. #1845 (@victorlin)
merge: Added validation to require at least two sequence inputs for merging, consistent with metadata merging behavior. #1865 (@victorlin)
validate: Send all log messages to stderr. #1869 (@victorlin)
validate: only print the entire merged Auspice config to stderr when there’s a validation error. #1878 (@joverlee521)
31.3.0 (3 July 2025)
Features
traits: Added new options --branch-labels and --branch-confidence to export branch labels for nodes which have a corresponding state change. These are useful for creating streamtrees which convey geographic jumps. #1814 (@jameshadfield)
filter, merge: Added a new option --nthreads to configure parallelism. Right now, it is only passed to SeqKit, but it may be used for other internal optimizations in the future. #1833 (@victorlin)
filter: Added a new option --skip-checks to bypass checks for duplicates in sequences and whether ids in metadata have a sequence entry. Mainly useful when working with larger files. #1833 (@victorlin)
Added a new AUGUR_PROFILE environment variable. If set, Augur will run with Python’s cProfile profiler and save results to the value, which should be a file path. This may result in slightly slower run times, and should only be used for debugging purposes. #1835 (@victorlin)
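The general pattern behind an environment-variable-controlled profiler can be sketched with the standard library alone. The wrapper name run_with_optional_profile and its env parameter are hypothetical; only the AUGUR_PROFILE variable name and the use of cProfile come from the entry above, and Augur's actual wiring may differ.

```python
import cProfile
import os


def run_with_optional_profile(func, env=os.environ):
    """Call func(), profiling it when AUGUR_PROFILE names an output file.

    Hypothetical wrapper for illustration only.
    """
    output_path = env.get("AUGUR_PROFILE")
    if not output_path:
        return func()

    profiler = cProfile.Profile()
    try:
        # runcall enables the profiler only for the duration of func().
        return profiler.runcall(func)
    finally:
        # Save stats even if func() raised, so failures can be inspected.
        profiler.dump_stats(output_path)
```

The saved file can then be inspected with the standard pstats module, for example pstats.Stats(path).sort_stats("cumulative").print_stats(10).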
Bug fixes
filter, merge: Improved run time of sequence I/O operations, especially in the common use case of having a workflow manager run multiple invocations simultaneously. #1833 (@victorlin)
filter, merge: Previously, SeqKit was hardcoded to use its default of 4 threads per command, which could have resulted in oversubscription of resources in the common use case of having a workflow manager run multiple invocations simultaneously. The default behavior has been updated to use 1 thread per command to discourage oversubscription of resources. It is configurable with the new --nthreads option described above. #1833 (@victorlin)
31.2.1 (12 June 2025)
Bug fixes
curate format-dates: Removed redundant warning messages that were previously displayed when using --failure-reporting "warn". #1816 (@victorlin)
filter: Improved performance of --output-sequences by using SeqKit internally. #1794 (@victorlin)
filter: Improved performance when using --sequences without --sequence-index by skipping indexing of --sequences when no sequence-based filters are used. #1827 (@victorlin)
filter: Fixed a bug that prevented proper checking of duplicates and sequence index mismatches on VCF inputs. #1826 (@victorlin)
merge: Fixed a performance bug where input sequence file validation unnecessarily loaded file contents into device memory. #1820 (@victorlin)
refine: Fixed a bug where inferred dates were being wrongly marked as not inferred. #1829 (@victorlin)
31.2.0 (5 June 2025)
Features
Bug fixes
Added a missing redirect for the environment variables documentation page from its previous location. #1812 (@tsibley)
31.1.0 (27 May 2025)
Features
schema: Allow full stop character (.) in gene names. #955 (@jameshadfield)
Bug fixes
31.0.0 (19 May 2025)
Major Changes
augur mask --mask, augur tree --exclude-sites: BED files with inconsistent CHROM values (i.e., values in the first column of data lines) will throw an error, as Augur (implicitly) expects to be working on a single piece of DNA (chromosome, segment, etc), and multiple CHROM values in a BED file indicate a violation of this expectation. This is a breaking change. #945 (@genehack)
filter: Empty values in the metadata id column will result in an error that can only be resolved by editing the metadata file or by specifying a different id column with --metadata-id-columns. #1807 (@joverlee521)
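The CHROM consistency rule for BED files can be checked ahead of time with a few lines of Python. This is only a sketch: the function name single_bed_chrom and the header detection (lines starting with "#", "browser", or "track") are assumptions and may not match what augur.utils.read_bed_file() does.

```python
from typing import Iterable, Optional

# Assumed header prefixes; data lines are everything else that is non-empty.
HEADER_PREFIXES = ("#", "browser", "track")


def single_bed_chrom(lines: Iterable[str]) -> Optional[str]:
    """Return the one CHROM value used in a BED file, or None if it has no data lines.

    Raises ValueError when data lines disagree on CHROM.
    """
    chroms = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith(HEADER_PREFIXES):
            continue
        chroms.add(line.split()[0])  # CHROM is the first column

    if len(chroms) > 1:
        raise ValueError(f"inconsistent CHROM values: {sorted(chroms)}")
    return chroms.pop() if chroms else None


bed_lines = ["track name=mask", "chr1\t0\t10", "chr1\t20\t30"]
print(single_bed_chrom(bed_lines))
```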
Bug fixes
augur mask --mask, augur tree --exclude-sites: Providing an empty BED file, or one with only header lines and no data lines, will no longer cause an error to be thrown. #945 (@genehack)
augur.utils.read_bed_file() was rewritten for increased compliance with the BED file specification. In particular, header line detection is improved and multiple header lines are now supported. #945 (@genehack)
export v2: Improved the error message that is displayed when the metadata index column has duplicated values. #1791 (@genehack)
tree: Improved help text for --tree-builder-args to explain some IQ-TREE options won’t work because of defline rewriting. #875 (@genehack)
export v2: Automatically rename fields within the filters and colorings configs of the provided auspice config file to match the renamed fields in the exported nodes. #1804 (@joverlee521)
export v2: Divergence values are now exported with increased precision, showing up to 6 significant digits instead of 3. #1801 (@rneher)
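To show what the change from 3 to 6 significant digits means for a divergence value, here is a small sketch that uses Python's general float formatting. The helper name round_significant is hypothetical, and this is not necessarily how export v2 performs the rounding.

```python
def round_significant(value: float, digits: int = 6) -> float:
    """Round a float to the given number of significant digits."""
    # The 'g' presentation type keeps `digits` significant digits.
    return float(f"{value:.{digits}g}")


divergence = 0.000123456789
print(round_significant(divergence, 3))
print(round_significant(divergence, 6))
```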
30.0.1 (28 April 2025)
Bug fixes
filter: Removed the note that appeared in output when running with --sequences and without --sequence-index. The help text of both options has been updated to clarify the relationship between the two. #1797 (@victorlin)
30.0.0 (15 April 2025)
Major Changes
Note: The following breaking changes were effective as of version 29.1.0.
filter: Date values in <year>-<month> format with more than 4 digits in the year (e.g. 02025-04) or more than 2 digits in the month (e.g. 2025-004) are no longer supported. Support for these was unintentional, but it worked in practice. #1786 (@victorlin)
filter: Date values in <year>-<month>-<day> format that fall outside of valid date boundaries now fail with an error. For example, 2025-00-01 is invalid. Previously, all date parts were treated categorically without date validation so month=0 was its own category. #1786 (@victorlin)
filter: Date values in <year>-<month> format that fall outside of valid date boundaries are now auto-converted to the closest date. For example, 2025-00 will be auto-converted to 2025-01. Previously, all date parts were treated categorically without date validation so month=0 was its own category. It will now be treated as month=1. This is a side-effect of the change in 29.1.0 that switched to the same internal date parsing function that is used by other commands. A future major version may change behavior to fail with an error to better align with handling of <year>-<month>-<day>. #1774 (@victorlin)
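The three rules above can be summarized in a short sketch. It is written from the description in these notes, not from Augur's date parsing code: the function names, the regular expressions, and the choice to clamp an out-of-range month to the nearest valid month are assumptions for illustration.

```python
import re
from datetime import date

# At most 4 digits in the year and at most 2 in the month and day.
YEAR_MONTH = re.compile(r"(\d{1,4})-(\d{1,2})")
YEAR_MONTH_DAY = re.compile(r"(\d{1,4})-(\d{1,2})-(\d{1,2})")


def parse_year_month(value: str) -> tuple[int, int]:
    """Parse <year>-<month>, moving an out-of-range month to the closest valid one."""
    match = YEAR_MONTH.fullmatch(value)
    if match is None:
        raise ValueError(f"unsupported <year>-<month> value: {value!r}")
    year, month = int(match.group(1)), int(match.group(2))
    month = min(max(month, 1), 12)  # e.g. month 0 becomes month 1
    return year, month


def parse_year_month_day(value: str) -> date:
    """Parse <year>-<month>-<day>, failing on dates outside valid boundaries."""
    match = YEAR_MONTH_DAY.fullmatch(value)
    if match is None:
        raise ValueError(f"unsupported <year>-<month>-<day> value: {value!r}")
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)  # raises ValueError for e.g. month 0


print(parse_year_month("2025-00"))
print(parse_year_month_day("2025-04-15"))
```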
Bug fixes
filter: Version 29.1.0 inadvertently dropped support for date values in <year>-<month> or <year>-<month>-<day> format that are not in YYYY-MM or YYYY-MM-DD format. Support for some values has been restored. See the “Major Changes” section for details on which values are explicitly no longer supported. #1785 (@victorlin)
29.1.0 (10 April 2025)
Features
export v2: Allow multiple auspice config files (--auspice-config) which are merged together. Note that the merging of lists extends the original list, although elements representing the same data are overwritten instead. You can optionally write out this merged config via --output-auspice-config for debugging purposes. #1756 (@jameshadfield)
Bug fixes
titers: Improve error messages when titer models do not have enough data. #1769 (@huddlej)
align: Remove extra logs for insertions since the coordinates are output to the *.insertions.csv. #1772 (@joverlee521)
filter: Fixed an error with weighted sampling by year. #1776 (@victorlin)
filter: Previously, subsampling with --group-by year or month would crash on numeric dates. This has been fixed by switching to the same internal date parsing function that is used by other commands. #1774 (@victorlin)
filter: Made a small adjustment to use pandas’s "string" dtype alias when processing values in metadata. #1782 (@victorlin)
filter: Options --output and -o have been deprecated and will now show a warning message. See DEPRECATED.md for details. #1622 (@victorlin)
Updated outdated documentation on supported date formats in metadata. #882 (@victorlin)
29.0.0 (26 February 2025)
Major Changes
Updated default latitudes and longitudes for geography traits, including location name changes. See the pull request for more details. #1744 (@joverlee521)
curate apply-geolocation-rules: Augur’s standard geolocation rules are used by default and rules provided via --geolocation-rules are considered custom rules that have precedence over the default rules. The --no-default-rules flag can be used to ignore the default rules. See the pull request for more details. #1745 (@joverlee521)
augur.utils.read_strains has been removed as it’s been deprecated since January 2024. The same function is available through the public API as augur.io.read_strains. #1749 (@joverlee521)
Bumped minimum Python version to 3.9 as support for 3.8 was dropped in Augur v27.0.0. #1763 (@joverlee521)
Features
refine: Added a --remove-outgroup flag which can be used when rooting a tree on a single taxon. Rooting and removal of outgroup will be performed before any temporal inference, if applicable. #1751 (@jameshadfield)
Added standard geolocation rules in “augur/data/geolocation_rules.tsv” that can be used with augur curate apply-geolocation-rules. #1744 (@joverlee521)
[refine, export] Ambiguous dates (e.g. those with “XX” in the date string) are now exported in the Auspice JSON, and all tips now have an additional “inferred” boolean property. These changes only apply to temporal trees. #1760 (@jameshadfield)
Bug fixes
Certain strain names would be silently renamed by augur tree [--method iqtree]. We now avoid such renaming wherever possible and in cases where there are backslashes or single quotes we now raise a fatal error. Note that names with spaces in the FASTA header (description line) continue to be modified such that everything after the first space is not used in the resulting tree. #1750 (@jameshadfield)
Fixed the error that occurred when running augur curate --help. #1755 (@joverlee521)
28.0.1 (10 February 2025)
Bug Fixes
28.0.0 (30 January 2025)
Major Changes
export v2: The string “none” is now an invalid value for --color-by-metadata and --metadata-columns options and will be ignored to prevent clashes with Auspice’s internal use of “none”. #1113 (@joverlee521)
schema: The string “none” is now an invalid branch label, node_attr key, and coloring key. #1113 (@joverlee521)
curate apply-geolocation-rules: The geolocation rule matching has been updated to be case-insensitive. Use the new --case-sensitive flag if you want to revert to the previous behavior of case-sensitive matching. #1740, #1741 (@joverlee521)
augur.io.read_sequences: Only accept the values "fasta" and "genbank" for format, instead of allowing any value supported by Biopython. #1731 (@victorlin)
This also applies to augur.io.sequences.read_single_sequence, which is not in the public API.
Features
All commands: Support compressed formats for input sequence files. This was already the case for most commands. Internal standardization extends the support to all other commands. #1730 (@victorlin)
Bug Fixes
When using Biopython >=1.85: properly detect augur ancestral --root-sequence file format and, for all commands, support FASTA files with comments. #1731 (@victorlin)
Internal changes
Added a new function augur.io.sequences.read_single_sequence as a wrapper around Bio.SeqIO.read with support for compressed formats, similar to the augur.io.sequences.read_sequences wrapper around Bio.SeqIO.parse. #1730 (@victorlin)
27.2.0 (22 January 2025)
Features
export: Added a new option --warning to display a warning banner in Auspice, supported as of Auspice version 2.62.0. #1722 (@victorlin)
27.1.0 (15 January 2025)
Features
ancestral: Add --seed argument to enable deterministic inference of root states by TreeTime. #1690 (@huddlej)
Bug Fixes
ancestral, refine: Explicitly specify how the root and ambiguous states are handled during sequence reconstruction and mutation counting. #1690 (@rneher)
titers: Fix type errors in code associated with cross-validation of models. #1688 (@huddlej)
export: The help text for --lat-longs has been improved with a link to the defaults and specifics around the overriding behavior. #1715 (@victorlin)
augur.io.read_metadata: Pandas versions <1.4.0 prevented this function from properly setting the index column’s data type. Support for those older versions has been dropped. #1716 (@victorlin)
In version 24.4.0, one of the new features was that all options that take multiple values could be repeated. Unfortunately, it overlooked a few that have been fixed in this version. #1707 (@victorlin)
augur curate rename --field-map
augur curate transform-strain-name --backup-fields
augur curate format-dates --expected-date-formats help text has been improved with clarifications regarding how values provided interact with builtin formats and how to match masked date parts. #1707, #1718 (@victorlin)
parse: Transform strain names the same way in both metadata and sequences instead of only transforming sequences. #1712 (@huddlej)
27.0.0 (9 December 2024)
Major Changes
Bug fixes
export: validation will no longer crash with KeyError: 'tree' when newer versions of jsonschema (≥4.18.0) are installed. #1358 (@victorlin)
26.2.0 (20 November 2024)
Features
This is the first version to officially support Python 3.12 and Pandas v2. #1671 #1678 (@corneliusroemer, @victorlin)
curate: change output metadata to RFC 4180 CSV-like TSVs to match the TSV format output by other Augur subcommands and the Nextstrain ecosystem as discussed in #1566. #1565 (@joverlee521)
26.1.0 (12 November 2024)
Features
ancestral, translate: Add --skip-validation as an alias to --validation-mode=skip. #1656 (@victorlin)
clades: Allow customizing the validation of input node data JSON files with --validation-mode and --skip-validation. #1656 (@victorlin)
tree: When using iqtree, check for all synonyms of default args when detecting potential conflicts, e.g. --threads-max is equivalent to -ntmax. Previously, we were only checking for the latter. Also use new, preferred IQtree2 option names (e.g. --polytomy instead of -czb etc.). #1547 (@corneliusroemer)
Bug Fixes
index: Previously specifying a directory that does not exist in the path to --output would result in an incorrect error stating that the input file does not exist. It now shows the correct path responsible for the error. #1644 (@victorlin)
curate format-dates: Update help docs and improve failure messages to show use of --expected-date-formats. #1653 (@joverlee521)
parse: fix test failure with pandas 2.2. #1471 (@emollier)
26.0.0 (17 September 2024)
Major Changes
filter: Duplicate header names in the FASTA file (--sequences) will now result in an error. #1613 (@victorlin)
parse: When both strain and name fields are present, the strain field will now be used as the sequence ID field. #1629 (@victorlin)
merge: Generated source columns (e.g. __source_metadata_{NAME}) are now omitted by default. They may be explicitly included with --source-columns=TEMPLATE or explicitly omitted with --no-source-columns. This may be a breaking change for any existing uses of augur merge relying on the generated columns, though as augur merge is relatively new we believe usage to be scant if extant at all. #1625 #1632 (@tsibley)
Bug Fixes
25.4.0 (3 September 2024)
Features
merge: Table-specific id columns and delimiters may now be specified, e.g. --metadata-id-columns X=id Y=strain and --metadata-delimiters X=, Y=';', to allow more precise behaviour and avoid ordering issues. #1594 (@tsibley)
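The NAME=VALUE form used for table-specific settings can be parsed with a simple split on the first equals sign. The function name parse_table_options is hypothetical and this sketch is not taken from augur merge; it only illustrates why splitting on the first "=" keeps values such as a literal "=" delimiter intact.

```python
from typing import Iterable


def parse_table_options(values: Iterable[str]) -> dict:
    """Turn ['X=id', 'Y=strain'] into {'X': 'id', 'Y': 'strain'}.

    Hypothetical helper for illustration only.
    """
    options = {}
    for value in values:
        # partition splits on the first '=' only, so the value may contain '='.
        name, separator, setting = value.partition("=")
        if not separator or not name:
            raise ValueError(f"expected NAME=VALUE, got {value!r}")
        options[name] = setting
    return options


print(parse_table_options(["X=id", "Y=strain"]))
print(parse_table_options(["X=,", "Y=;"]))
```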
Bug Fixes
filter: Improved warning and error messages in the case of missing columns. #1604 (@victorlin)
merge: Any user-customized ~/.sqliterc file is now ignored so it doesn’t break augur merge’s internal use of SQLite. #1608 (@tsibley)
merge: Non-id columns in metadata inputs that would conflict with the output id column are now forbidden and will cause an error if present. Previously they would overwrite values in the output id column, causing incorrect output. #1593 (@tsibley)
import: Spaces in BEAST MCC tree annotations (for example, from a discrete state reconstruction) no longer break augur import beast’s parsing. #1610 (@watronfire)
25.3.0 (22 August 2024)
Features
A new command, augur merge, now allows for generalized merging of two or more metadata tables. #1563 (@tsibley)
Two new commands, augur read-file and augur write-file, now allow external programs to do i/o like Augur by piping from/to these new commands. They provide handling of compression formats and newlines consistent with the rest of Augur. #1562 (@tsibley)
A new debugging mode can be enabled by setting the AUGUR_DEBUG environment variable to 1 (or any non-empty value). Currently the only effect is to print more information about handled (i.e. anticipated) errors. For example, stack traces and parent exceptions in an exception chain are normally omitted for handled errors, but setting this env var includes them. Future debugging and troubleshooting features, like verbose operation logging, will likely also condition on this new debugging mode. #1577 (@tsibley)
filter: Added the ability to use weights in subsampling. See help text of --group-by-weights and the updated Filtering and Subsampling guide for more information. #1454 (@victorlin)
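The behaviour described for AUGUR_DEBUG, a short message by default and a full chained traceback when the variable is non-empty, follows a common pattern that can be sketched as below. The function name report_handled_error, the "ERROR:" prefix, and the env and stream parameters are assumptions for this example, not Augur's implementation.

```python
import os
import sys
import traceback


def report_handled_error(error: BaseException, env=os.environ, stream=None) -> None:
    """Print a handled error, with extra detail when AUGUR_DEBUG is non-empty.

    Hypothetical helper for illustration only.
    """
    stream = sys.stderr if stream is None else stream
    if env.get("AUGUR_DEBUG"):
        # Includes the stack trace and any chained parent exceptions.
        traceback.print_exception(type(error), error, error.__traceback__, file=stream)
    else:
        print(f"ERROR: {error}", file=stream)


try:
    try:
        raise KeyError("strain")
    except KeyError as cause:
        raise ValueError("metadata is missing a required column") from cause
except ValueError as handled:
    report_handled_error(handled, env={})                    # message only
    report_handled_error(handled, env={"AUGUR_DEBUG": "1"})  # full chain
```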
Bug Fixes
Embedded newlines in quoted field values of metadata files read/written by many commands, annotation files read by augur curate apply-record-annotations, and index files written by augur index are now properly handled. #1561 #1564 (@tsibley)
Output written to stderr (e.g. informational messages, warnings, errors, etc.) is now always line-buffered regardless of the Python version in use. This helps with interleaved stderr and stdout. Previously, stderr was block-buffered on Python 3.8 and line-buffered on 3.9 and higher. #1563 (@tsibley)
25.2.0 (24 July 2024)
Features
export v2: we now limit numerical precision on floats in the JSON. This should not change how a dataset is displayed / interpreted in Auspice but allows the gzipped & minimised JSON filesize to be reduced by around 30% (dataset-dependent). #1512 (@jameshadfield)
traits, export v2: augur traits now reports all confidence values above 0.1% rather than limiting them to the top 4 results. There is no change in the eventual Auspice dataset as augur export v2 will still only consider the top 4. #1512 (@jameshadfield)
curate: Excel (.xlsx and .xls) and OpenOffice (.ods) spreadsheet files are now also supported as metadata inputs (--metadata). The first sheet in the workbook is read as tabular data. #1550 (@tsibley)
Bug Fixes
titers sub: Fixes a bug where antigenic weights were assigned to branches for substitutions in the incorrect order of <derived allele><position><ancestral allele> instead of <ancestral allele><position><derived allele>. #1555 (@huddlej)
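For reference, the corrected <ancestral allele><position><derived allele> ordering can be illustrated by building substitution labels from two aligned sequences. The function name substitution_labels and the 1-based positions are assumptions for this sketch, which is unrelated to the titers code itself.

```python
def substitution_labels(ancestral: str, derived: str) -> list:
    """List substitutions as <ancestral allele><position><derived allele>.

    Positions are 1-based (an assumption for this sketch).
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{old}{position}{new}"
        for position, (old, new) in enumerate(zip(ancestral, derived), start=1)
        if old != new
    ]


print(substitution_labels("MKT", "MRT"))
```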
25.1.1 (15 July 2024)
Bug Fixes
curate parse-genbank-location: Fix a bug where a mix of empty and populated location-field values would result in inconsistent fields in the output NDJSON. #1531 (@genehack)
25.1.0 (11 July 2024)
Features
25.0.0 (10 July 2024)
Major changes
Features
Added a new sub-command augur curate apply-geolocation-rules to apply user curated geolocation rules to the geolocation fields in a metadata file. Previously, this was available as a script within the nextstrain/ingest repo. #1491 (@victorlin)
Added a default color for the “Asia” region that will be used in augur export if no custom colors are provided. #1490 (@joverlee521)
Added a new sub-command augur curate apply-record-annotations to apply user curated annotations to existing fields in a metadata file. Previously, this was available as merge-user-metadata in the nextstrain/ingest repo. #1495 (@joverlee521)
Added a new sub-command augur curate abbreviate-authors to abbreviate lists of authors to “et al.” Previously, this was available as the transform-authors script within the nextstrain/ingest repo. #1483 (@genehack)
Added a new sub-command augur curate parse-genbank-location to parse the geo_loc_name field from GenBank records. Previously, this was available as the translate-genbank-location script within the nextstrain/ingest repo. #1485 (@genehack)
curate format-dates: Added defaults to --expected-date-formats so that ISO 8601 dates (%Y-%m-%d) and its various masked forms (e.g. %Y-XX-XX) are automatically parsed by the command. #1501 (@joverlee521)
Added a new sub-command augur curate transform-strain-name to filter strain names based on matching a regular expression. Previously, this was available as the transform-strain-names script within the nextstrain/ingest repo. #1514 (@genehack)
Added a new sub-command augur curate rename to rename field / column names. Previously, a similar version was available as the transform-field-names script within the nextstrain/ingest repo however the behaviour is slightly changed here. #1506 (@jameshadfield)
Bug Fixes
filter: Improve speed of checking duplicates in metadata, especially for large files. #1466 (@victorlin)
curate: Stop adding double quotes to the metadata TSV output when field values have internal quotes. #1493 (@joverlee521)
curate format-dates: Mask empty date values as XXXX-XX-XX to represent unknown dates. #1509 (@joverlee521)
24.4.0 (15 May 2024)
Features
All commands: Allow repeating an option that takes multiple values. Previously, if multiple option flags were specified (e.g. --exclude-where 'region=A' --exclude-where 'region=B'), only the last one was used. Now, all values are used. #1445 (@victorlin)
ancestral, translate: output node data files are now validated. The argument --validation-mode is added which controls this behaviour (default: error). This argument also controls validation of the input node-data file (ancestral only). #1440 (@jameshadfield)
export: Updated default latitudes and longitudes for geography traits. This only applies if you are not using --lat-longs to override the built in mappings. #1449 (@trvrb)
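The difference between "only the last one was used" and "all values are used" for a repeated multi-value option can be reproduced with the standard argparse module. This sketch uses argparse's built-in "extend" action to show the general behaviour; whether Augur uses this exact action internally is not stated in these notes, and the parser below is a standalone example.

```python
import argparse

args = ["--exclude-where", "region=A", "--exclude-where", "region=B"]

# Default "store" action: each repeat replaces the previous values.
last_only = argparse.ArgumentParser()
last_only.add_argument("--exclude-where", nargs="+")
print(last_only.parse_args(args).exclude_where)

# "extend" action: values from every repeat are accumulated.
accumulate = argparse.ArgumentParser()
accumulate.add_argument("--exclude-where", nargs="+", action="extend", default=[])
print(accumulate.parse_args(args).exclude_where)
```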
Bug Fixes
validation: we no longer exit with a non-zero exit code when the requested validation mode is “warn” #1440 (@jameshadfield)
validation: we no longer perform any validation when the requested validation mode is “skip” #1440 (@jameshadfield)
filter: Send all log messages to stderr. This allows output to be written to stdout (e.g. --output-strains /dev/stdout). #1459 (@victorlin)
24.3.0 (18 March 2024)
Features
Bug Fixes
filter: Updated docs with an example of tiered subsampling. #1425 (@victorlin)
export: Fixes bug #1433 introduced in v23.1.0, that causes validation to fail when gene names start with nuc, e.g. nucleocapsid. #1434 (@corneliusroemer)
import: Fixes bug introduced in v24.2.0 that prevented import beast from running. #1439 (@tomkinsc)
translate, ancestral: Compound CDS are now exported as segmented CDS and are now viewable in Auspice. #1438 (@jameshadfield)
24.2.3 (23 February 2024)
Bug Fixes
filter: Updated the help and report text of --min-length to explicitly state that the minimum length filter only counts standard nucleotide characters A, C, G, or T (case-insensitive). This has been the behavior since version 3.0.3.dev1, but has never been explicitly documented. #1422 (@joverlee521)
frequencies: Fixed a bug introduced in 24.2.0 and 24.1.0 that prevented --regions from working when providing regions other than the default “global” region. #1424
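The counting rule behind --min-length (only A, C, G, or T, case-insensitive) is easy to misread as plain sequence length, so a small sketch may help. The function names are hypothetical and this is not the filter implementation.

```python
STANDARD_NUCLEOTIDES = frozenset("ACGT")


def count_standard_nucleotides(sequence: str) -> int:
    """Count A, C, G, and T characters, ignoring case."""
    return sum(1 for base in sequence.upper() if base in STANDARD_NUCLEOTIDES)


def passes_min_length(sequence: str, min_length: int) -> bool:
    """Apply a minimum-length check that ignores N, gaps, and ambiguity codes."""
    return count_standard_nucleotides(sequence) >= min_length


sequence = "ACGTNNnn--acgt"
print(len(sequence), count_standard_nucleotides(sequence))
```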
24.2.2 (16 February 2024)
Bug Fixes
filter: In versions 24.2.0 and 24.2.1, --query stopped working in cases where internal optimizations added in version 24.2.0 failed to parse the columns from the query. It now falls back to non-optimized behavior that allows queries to work. #1418 (@victorlin)
filter: Handle backtick quoting in internal optimizations of --query. #1417 (@victorlin)
24.2.1 (14 February 2024)
Bug Fixes
frequencies: Fixed a bug introduced in 24.2.0 that prevented --method diffusion from working alongside --tree. #1412 (@victorlin)
24.2.0 (12 February 2024)
Features
filter: Added a new option --query-columns that allows specifying what columns are used in --query along with the expected data types. If unspecified, automatic detection of columns and types is attempted. #1294 (@victorlin)
augur.io.read_metadata: A new optional columns argument allows specifying a subset of columns to load. The default behavior still loads all columns, so this is not a breaking change. #1294 (@victorlin)
augur parse: A new optional --output-id-field argument allows the user to select any ID field for the produced FASTA file (e.g. ‘accession’ instead of ‘name’ or ‘strain’). #1403 (@j23414)
When no --output-id-field is given and the data has both name and strain fields, continue to preferentially use name over strain as the sequence ID field; but, throw a deprecation warning that the order will be switched to prefer strain over name in the future to be consistent with the rest of Augur.
Added entry to DEPRECATED.md.
Compression should now be supported for all input and output files. Please open an issue if you find one that doesn’t! #1381 (@victorlin)
export v2: Add support to specify metadata columns to export without using them as colorings. This can be done with the metadata_columns property in the Auspice config JSON or via the --metadata-columns flag in the command line. #1384 (@joverlee521)
Bug Fixes
filter: In version 24.1.0, automatic conversion of boolean columns was accidentally removed. It has been restored with additional support for empty values evaluated as None. #1410 (@victorlin)
filter: The order of rows in --output-metadata and --output-strains now reflects the order in the original --metadata. #1294 (@victorlin)
filter, frequencies, refine: Performance improvements to reading the input metadata file. #1294 (@victorlin)
For filter, this comes with increased writing times for --output-metadata and --output-strains. However, net I/O time still decreased during testing of this change.
filter: Updated the help text of --include and --include-where to explicitly state that this can add strains that are missing an entry from --sequences. #1389 (@victorlin)
filter: Fixed the summary messages to properly reflect force-inclusion of strains that are missing an entry from --sequences. #1389 (@victorlin)
filter: Updated wording of summary messages. #1389 (@victorlin)
Enforce UTF-8 encoding when reading and writing files. Improve error messages when a non-UTF-8 file is used. #1381 (@victorlin)
24.1.0 (30 January 2024)
Features
augur.io.read_metadata: A new optional dtype argument allows custom data types for all columns. Automatic type inference still happens by default, so this is not a breaking change. #1252 (@victorlin)
augur.io.read_vcf has been removed and usage replaced with TreeTime’s function of the same name which has improved validation of the VCF file. #1366 (@jameshadfield)
Bug Fixes
filter, frequencies, refine: Speed up reading of the metadata file. #1252 (@victorlin)
traits: Previously, columns with only numeric values were treated as numerical data. These are now treated as categorical data for discrete trait analysis. #1252 (@victorlin)
Support Biopython ≥1.82 by requiring bcbio-gff ≥0.7.1. #1400 (@victorlin)
24.0.0 (22 January 2024)
Major Changes
ancestral, translate: For VCF inputs please ensure you are using TreeTime 0.11.2 or later. A large number of bugfixes and improvements have been added in both Augur and TreeTime. #1355 and TreeTime #263 (@jameshadfield)
ancestral, translate: GenBank files now require the (GFF mandatory) source feature to be present. #1351 (@jameshadfield)
ancestral, translate: For GFF files, we extract the genome/sequence coordinates by inspecting the sequence-region pragma, region type and/or source type. This information is now required. #1351 (@jameshadfield)
Features
ancestral, translate: Improvements to VCF inputs / outputs. #1355 and TreeTime #263 (@jameshadfield)
Output VCF will better match the input VCF, including CHROM name and ploidy encoding.
VCF inputs now require --vcf-reference-output
AA sequences are now exported for the tree root
VCF writing is now 3 orders of magnitude faster (dataset dependent)
ancestral, translate: A range of improvements to how we parse GFF and GenBank reference files. #1351 (@jameshadfield)
translate will now always export a ‘nuc’ annotation in the output JSON, allowing it to pass validation
Gene/CDS names of ‘nuc’ are now forbidden.
If a Gene/CDS in the GFF/GenBank file is unparsed we now print a warning.
ancestral: For VCF alignments, a VCF output file is now only created when requested via --output-vcf. #1344 (@jameshadfield)
ancestral: Improvements to command line arguments. #1344 (@jameshadfield)
Incompatible arguments are now checked, especially related to VCF vs FASTA inputs.
--vcf-reference and --root-sequence are now mutually exclusive.
translate: Tree nodes are checked against the node-data JSON input to ensure sequences are present. #1348 (@jameshadfield)
utils::load_features: This function may now raise AugurError. #1351 (@jameshadfield)
export v2: Automatically minify large outputs. Use --no-minify-json to disable this default behavior. #1352 (@victorlin)
Added a new file DEPRECATED.md to document timelines and progress of deprecated features in the Augur CLI and Python API. #1371 (@victorlin)
Bug Fixes
ancestral, translate: Various fixes to VCF inputs / outputs. #1355 and TreeTime #263 (@jameshadfield)
Fix incorrect (but passing) tests
Fix case-sensitive sequence comparisons between the root and reference sequences.
Fix a bug where ambiguous alleles are not inferred (see #1380 for full details).
Fix a bug where positions with no sequence information were assigned a base because the mask was not being computed (see #1382 for full details).
More than one ALT allele is now correctly parsed
Mutations followed by an insertion are now parsed
Unchanged ref genotypes are now encoded as ‘0’ rather than ‘.’
ALT alleles “*” are now valid (introduced in VCF spec 4.2, but observed in VCF 4.1 files)
Positions with no variation are no longer exported
ancestral, translate: Fixes for JSON (non-VCF) inputs. #1355 (@jameshadfield)
ancestral, translate: Avoid incompatibilities with Biopython >=1.82. #1374, #1387 (@victorlin)
ancestral, translate: Address Biopython deprecation warnings. #1379 (@victorlin)
ancestral: Previously, the help text for --genes falsely claimed that it could accept a file. Now, it can truly claim that. #1353 (@victorlin)
translate: The ‘source’ ID for GFF files is now ignored as a potential gene feature (it is still used for overall nuc coords). #1348 (@jameshadfield)
translate: Improvements to command line arguments. #1348 (@jameshadfield)
--tree and --ancestral-sequences are now required arguments.
Separate VCF-only arguments into their own group.
translate: Fixes a bug in the parsing behaviour of GFF files whereby the presence of the --genes command line argument would change how we read individual GFF lines. Issue #1349, PR #1351 (@jameshadfield)
If TreeTimeError is encountered Augur now exits with code 2 rather than 0. (This restores the original behaviour.) #1367 (@jameshadfield)
Deprecate read_strains from augur.utils and add it to the public API under augur.io. #1353 (@victorlin)
23.1.1 (7 November 2023)
Bug Fixes
23.1.0 (22 September 2023)
Features
Support treetime 0.11.* #1310 (@corneliusroemer)
export: Allow minimal export using only a (newick) tree in augur export v2. #1299 (@jameshadfield)
A number of schema updates and improvements #1299 (@jameshadfield)
We now require all nodes to have node_attrs on them with one of div or num_date present
Some never-used properties are removed from the schemas, including a pattern for defining nucleotide INDELs which was never used by augur or auspice.
Tip label defaults are now settable within the auspice-config JSON
Empty colorings definitions are allowed (the tree will be grey in Auspice)
Bug fixes
ancestral: Export amino acid sequences inferred for the root node of the tree in the node data JSON output for compatibility with augur translate output. #1317 (@huddlej)
23.0.0 (5 September 2023)
Major Changes
Drop support for Python 3.7. #1296 (@victorlin)
Features
export v2: Allow the root-sequence data to be included (inlined) in the main dataset JSON file, avoiding the need for a sidecar _root-sequence.json file. #1295 (@jameshadfield)
22.4.0 (29 August 2023)
Features
refine: Export covariance matrix and standard deviation for clock rate regression in the node data JSON output when these values are calculated by TreeTime. These new values appear in the clock data structure of the JSON output as cov and rate_std keys, respectively. #1284 (@huddlej)
Bug fixes
clades: Fix outputs for genes named NA (previously the value was replaced by nan). #1293 (@rneher)
distance: Improve documentation by describing how gaps get treated as indels and how users can ignore specific characters in distance calculations. #1285 (@huddlej)
Fix help output compatibility with non-Unicode streams. #1290 (@victorlin)
22.3.0 (14 August 2023)
Features
ancestral: Add functionality to reconstruct ancestral amino acid sequences and add inferred mutations to the node_data_json with output equivalent to augur translate. ancestral now takes an annotation (--annotation), a list of genes (--genes), and a file name pattern for amino acid alignments (--translations). Mutations for each of these genes will be inferred and added to each node in the output JSON as a list at ['aa_muts'][gene]. The annotations will be added to the annotation field in the output JSON. Inferred amino acid sequences can be saved with the new --output-translations argument. #1258 (@rneher, @huddlej)
ancestral: Add the ability to report mutations relative to a sequence other than the inferred root of the tree. This sequence can be specified via --root-sequence, and all differences between it and the inferred sequence of the root node will be added as mutations to the root node, for both nucleotides and amino acids. This was previously already possible for VCF input via --vcf-reference. #1258 (@rneher)
refine: Add mid_point as a rooting option to refine. #1257 (@rneher)
Bug fixes
filter: In version 22.2.0, --query would fail when the .str accessor was used on a column. This has been fixed. #1277 (@victorlin)
22.2.0 (31 July 2023)
Features
Adds a new sub-command augur curate titlecase. The titlecase command is intended to apply titlecase to string fields in a metadata record (e.g. BRAINE-LE-COMTE, FRANCE -> Braine-le-Comte, France). Previously, this was available in the transform-string-fields script within the monkeypox repo. #1197 (@j23414 and @joverlee521)
Bug fixes
export v2: Previously, when strain was not used as the metadata ID column, node attributes might have gone missing from the final Auspice JSON. This has been fixed. #1260, #1262 (@victorlin, @joverlee521)
export v1: Added a deprecation warning for this command. #1265 (@victorlin)
export v1: The recently introduced flag --metadata-id-columns did not work properly due to the same export v2 bug that was fixed in this release. Instead of fixing it in export v1, drop the broken feature since this command is no longer being maintained. #1265 (@victorlin)
filter: Expose internal Pandas errors from --query which may be useful to users. #1267 (@victorlin)
filter: Previously, --query would fail when numerical comparisons were used on columns with missing values. This has been fixed. #1269 (@victorlin)
22.1.0 (10 July 2023)
Features
export, frequencies, refine, traits: Add a new flag --metadata-id-columns to customize the possible metadata ID columns. Previously, this was only available in augur filter. #1240 (@victorlin)
Add new sub-subcommand augur curate format-dates. The format-dates command is intended to be used to format date fields to ISO 8601 date format (YYYY-MM-DD), where incomplete dates are masked with XX (e.g. 2023 -> 2023-XX-XX). #1146 (@joverlee521)
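The masking described for format-dates can be illustrated with a small sketch. This is not Augur's implementation: the function name mask_incomplete_date is hypothetical, and the sketch only handles inputs already written as YYYY, YYYY-MM, or YYYY-MM-DD.

```python
import re


def mask_incomplete_date(value: str) -> str:
    """Pad an incomplete ISO 8601 date with XX (hypothetical helper, not Augur's API)."""
    # Year is required; month and day are optional.
    match = re.fullmatch(r"(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?", value.strip())
    if match is None:
        raise ValueError(f"Unsupported date format: {value!r}")
    year, month, day = match.groups()
    # Missing parts are masked with XX.
    return f"{year}-{month or 'XX'}-{day or 'XX'}"


print(mask_incomplete_date("2023"))
print(mask_incomplete_date("2023-07"))
```

Complete dates pass through unchanged, so the helper is safe to apply to a column that mixes complete and incomplete values.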
Bug fixes
parse: Fix a bug where --fix-dates was always applied, with a default of --fix-dates=monthfirst. Now, running without --fix-dates will leave dates as-is. #1247 (@victorlin)
augur.io.open_file: Previously, the docs described a type restriction on path_or_buffer but it was not enforced. It has been updated to allow all I/O classes, and is enforced at run-time. #1250 (@victorlin)
filter: Fix a bug where data files consisting of only numerical strain names would not work when both --metadata and --sequences are passed. #1256 (@victorlin)
22.0.3 (14 June 2023)
Bug fixes
utils: Serialize pandas Series in write_json. #1213 (@victorlin)
22.0.2 (26 May 2023)
Bug fixes
CI: Add a Github action to test augur on 8 Nextstrain pathogen workflows using example data. #1217 (@corneliusroemer)
parse: Denote required arguments including --fields, --output-sequences, and --output-metadata. #1228 (@huddlej)
Fix export of the strand attribute of gene annotations. Previously, features on the negative strand were not annotated as such since the code assumed that the strand attribute was boolean instead of [-1, +1]. #1211 (@rneher, @j23414)
augur.io.read_metadata: Explicitly set date column as string type to prevent year-only dates from being inferred as integers. #1235 (@joverlee521)
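The strand bug above comes down to Python truthiness: -1 is truthy, so a boolean test cannot tell the two strands apart. The sketch below only illustrates that pitfall. Both function names and the "+"/"-" output symbols are assumptions for illustration, not Augur's actual code.

```python
def strand_symbol_buggy(strand: int) -> str:
    # Treats strand as a boolean. -1 is truthy, so negative strands are mislabelled.
    return "+" if strand else "-"


def strand_symbol(strand: int) -> str:
    # Compares against the actual strand values, +1 and -1.
    if strand == 1:
        return "+"
    if strand == -1:
        return "-"
    raise ValueError(f"Unexpected strand value: {strand!r}")


print(strand_symbol_buggy(-1), strand_symbol(-1))
```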
22.0.1 (16 May 2023)
Bug fixes
export: No longer export duplicate entries in the colorings array, a bug which has been present in Augur since at least v12 #719. #1218 (@jameshadfield)
export: In version 22.0.0, some configurations of export may have resulted in the clade coloring appearing last in the Auspice dropdown rather than first. This is now fixed. #1218 (@jameshadfield)
export: In version 22.0.0, validation of augur.utils.read_node_data was changed to error when a node data JSON did not contain any actual data. This causes export to error when an empty node data JSON is passed, as for example in ncov’s pathogen-ci. This is now fixed by warning instead. The bug was originally introduced in PR #728. #1214 (@corneliusroemer)
22.0.0 (9 May 2023)
Major Changes
export, filter, frequencies, refine, traits: From versions 10.0.0 through 21.1.0, arbitrary delimiters for --metadata were supported due to internal implementation differences from the advertised CSV and TSV support. Starting with this version, non-CSV/TSV files will no longer be supported by default. To adjust for this breaking change, specify custom delimiters with the new --metadata-delimiters flag. #1196 (@victorlin)
augur.io.read_metadata: Previously, this supported any arbitrary delimiters for the metadata. Now, it only supports a list of possible delimiters represented by the new delimiters keyword argument, which defaults to , and \t. #812 (@victorlin)
refine: The seeding method for --seed has been updated. This affects usages that rely on the reproducibility of outputs with the same --seed value prior to this version. Outputs from this version onwards should be reproducible until the next implementation change, which we don’t expect to happen any time soon. #1207 (@rneher)
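Restricting metadata to a fixed list of delimiters, as in the read_metadata change above, can be sketched with the standard library's csv.Sniffer. This is an illustration under assumptions, not Augur's implementation: detect_delimiter is a hypothetical helper, and only its default of comma and tab is taken from the entry above.

```python
import csv


def detect_delimiter(sample: str, delimiters=(",", "\t")) -> str:
    """Return the delimiter used in `sample`, considering only `delimiters`.

    Hypothetical helper. csv.Sniffer raises csv.Error when none of the
    allowed delimiters fits the sample.
    """
    dialect = csv.Sniffer().sniff(sample, delimiters="".join(delimiters))
    return dialect.delimiter


tsv_sample = "strain\tdate\nA\t2020-01-01\nB\t2020-02-01\n"
print(repr(detect_delimiter(tsv_sample)))
```

A file using some other delimiter, such as a semicolon, is rejected with csv.Error unless that delimiter is added to the allowed list.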
Features
Constrain bcbio-gff to >=0.7.0 and allow Biopython >=1.81 again. We had to introduce the Biopython constraint in v21.0.1 (see #1152) due to bcbio-gff <0.7.0 relying on the removed Biopython feature UnknownSeq. #1178 (@corneliusroemer)
augur.io.read_metadata (used by export, filter, frequencies, refine, and traits): Previously, this used the Python parser engine for pandas.read_csv(). Updated to use the C engine for faster reading of metadata. #812 (@victorlin)
curate: Allow custom metadata delimiters with the new --metadata-delimiters flag. #1196 (@victorlin)
Bump the default recursion limit to 10,000. Users can continue to override this limit with the environment variable AUGUR_RECURSION_LIMIT. #1200 (@joverlee521)
clades, export v2: Clade labels + coloring keys are now definable via arguments to augur clades allowing pipelines to use multiple invocations of augur clades resulting in multiple sets of colors and branch labels. How labels are stored in the (intermediate) node-data JSON files has changed. This should be fully backwards compatible for pipelines using augur commands, however custom scripts may need updating. PR #728 (@jameshadfield)
refine: Add flag --max-iter to control the maximal number of iterations TreeTime uses to infer time trees. This was previously hard-coded to 2, which is now the default. #1203 (@rneher)
refine: Add flags --greedy-resolve and --stochastic-resolve to customize polytomy resolution. #1203, #1207 (@rneher)
--greedy-resolve: resolve polytomies by greedily minimizing tree length (default behavior, unchanged).
--stochastic-resolve: resolve polytomies as random coalescent trees.
These are mutually exclusive with the pre-existing --keep-polytomies flag.
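The recursion limit entry in the Features above describes a default of 10,000 with an environment variable override. A minimal sketch of that pattern follows. Only the default value and the variable name AUGUR_RECURSION_LIMIT come from the entry; the function name is hypothetical and this is not Augur's actual code.

```python
import os
import sys

DEFAULT_RECURSION_LIMIT = 10_000


def configured_recursion_limit(environ=os.environ) -> int:
    """Return the limit from AUGUR_RECURSION_LIMIT if set, else the default (hypothetical helper)."""
    value = environ.get("AUGUR_RECURSION_LIMIT")
    return int(value) if value else DEFAULT_RECURSION_LIMIT


# Apply the resolved limit to the running interpreter.
sys.setrecursionlimit(configured_recursion_limit())
```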
Bug fixes
filter, frequencies, refine, parse: Previously, ambiguous dates in the future had a limit of today’s date imposed on the upper value but not the lower value. It is now imposed on the lower value as well. #1171 (@victorlin)
refine: --year-bounds was ignored in versions 9.0.0 through 20.0.0. It now works. #1136 (@victorlin)
tree: Input alignment filenames which do not end in .fasta are now properly handled when using IQ-TREE. Previously their contents were overwritten first by augur tree itself (resulting in truncation) and then by the log output of IQ-TREE (resulting in an error). Thanks to Jon Bråte for reporting this bug. #1206 (@tsibley)
clades: A number of small bug fixes, improvements to documentation, tests and improved error detection. #1199 (@jameshadfield)
21.1.0 (14 March 2023)
Features
filter: Add --empty-output-reporting={error,warn,silent} option to allow filter to produce empty outputs without raising an error. The default behavior is still to raise an error when filter produces an empty output, so users will have to explicitly pass the “warn” or “silent” value to bypass the error. #1175 (@joverlee521)
Bug fixes
translate: Fix error handling when features cannot be read from reference sequence file. #1168 (@victorlin)
translate: Remove an unnecessary check which allowed for inaccurate error messages to be shown. #1169 (@victorlin)
frequencies: Previously, monthly pivot points calculated from the end of a month may have been shifted by 1-3 days. This is now fixed. #1150 (@victorlin)
docs: Fix minor formatting issues. #1095 (@victorlin)
Update development status on PyPI from “3 - Alpha” to “5 - Production/Stable”. This should have been done since the beginning of this changelog, but now it is official. #1160 (@corneliusroemer)
21.0.1 (17 February 2023)
Bug fixes
Constrain Biopython version to <=1.80 so that augur translate is not broken by a deprecation of UnknownSeq in 1.81. When running augur translate with Biopython 1.81, the user will receive an error starting with “ERROR: Package BCBio.GFF not found!” and ending with “TypeError: object of type 'NoneType' has no len()”. #1152 (@corneliusroemer)
21.0.0 (7 February 2023)
Major Changes
measurements export: Supports exporting multiple thresholds per collection via the measurements config and the --thresholds option. This change is backwards compatible with previous uses of the --threshold option. However, due to the updates to the JSON schema, users will need to update to Auspice v2.43.0 for thresholds to be displayed properly in the measurements panel. #1148 (@joverlee521)
Features
export v2: Add --validation-mode={error,warn,skip} option for more nuanced control of validation. The new “warn” mode performs validation and emits messages about potential problems, but it does not cause the export command to fail even if there are problems. #1135 (@tsibley)
Bug Fixes
20.0.0 (20 January 2023)
Major Changes
frequencies: Changes the logic for calculating the time points when frequencies are estimated to ensure that the user-provided “end date” is always included. This change in the behavior of the frequencies command fixes a bug where large intervals between time points (e.g., 3 months) could cause recent data to be omitted from frequency calculations. See the pull request for more details, including the scientific implications of this bug. #1121 (@huddlej)
19.3.0 (19 January 2023)
Features
Bug Fixes
utils: Serialize common numpy data types in write_json. #1119 (@victorlin)
filter: Standardize exit codes from internal error handling. #931 (@victorlin)
tree: Suppress the “Cannot specify --substitution-model unless using IQTree” warning when --substitution-model is left at its default. #1127 (@tsibley)
tree: Print the underlying error message when tree building fails. #1127 (@tsibley)
Previously, numpy and scipy were installed as dependencies of dependencies. Mark them as direct dependencies since they are used directly within Augur. #1120 (@victorlin)
19.2.0 (19 December 2022)
Features
titers: Allow users to specify a custom prefix for attributes in the JSON output (e.g., cTiter can be changed to custom_prefix_cTiter). #1106 (@huddlej)
19.1.0 (14 December 2022)
Features
io: Add open_file and write_sequences to the Python Public API. #1114 (@joverlee521)
19.0.0 (13 December 2022)
Major Changes
io: Only read_metadata and read_sequences are available as part of the Python Public API. Other Python API functions of the augur.io module are no longer directly available. This is a breaking change, although we suspect few users to be impacted. If you still need to use other imports in your scripts, they can be imported from the Developer API but note that they are no longer part of the Public API. #1087 (@victorlin)
Bug Fixes
docs: Update the API documentation to reflect the latest state of things in the codebase. #1087 (@victorlin)
Fix support for Biopython version 1.80 which deprecated Bio.Seq.Seq.ungap(). #1102 (@victorlin)
export v2: Fixed a bug where colorings for zero values via --colors would not get applied to the exported Auspice JSON. #1100 (@joverlee521)
curate: Fixed a bug where metadata TSVs failed to parse if data within a column included comma-separated values. #1110 (@joverlee521)
18.2.0 (15 November 2022)
Features
Add the curate subcommand with two sub-subcommands, passthru and normalize-strings. The curate subcommand is intended to be a suite of commands to help users with data curation prior to running Nextstrain analyses. We will continue to add more subcommands as we identify other common data curation tasks. Please see the usage docs for details. #1039 (@joverlee521)
18.1.2 (1 November 2022)
Bug Fixes
traits: Fix trait inference when tips have missing values. #1081 (@huddlej)
18.1.1 (1 November 2022)
Bug Fixes
filter: Fixed a bug where --group-by week would fail when all samples in a chunk have been dropped due to ambiguous dates. #1080 (@victorlin)
18.1.0 (26 October 2022)
Features
filter: Add support to group by ISO week (--group-by week) during subsampling. #1067 (@victorlin)
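Grouping by ISO week can be sketched with the standard library's date.isocalendar(). This is an illustration only: group_by_iso_week is a hypothetical helper, and using (ISO year, ISO week) tuples as group keys is an assumption rather than the representation augur filter uses internally.

```python
from collections import defaultdict
from datetime import date


def group_by_iso_week(records):
    """Group (strain, 'YYYY-MM-DD') pairs by (ISO year, ISO week). Hypothetical helper."""
    groups = defaultdict(list)
    for strain, date_string in records:
        iso = date.fromisoformat(date_string).isocalendar()
        # iso[0] is the ISO year and iso[1] the ISO week number.
        groups[(iso[0], iso[1])].append(strain)
    return dict(groups)


records = [("A", "2022-01-01"), ("B", "2022-01-03"), ("C", "2022-01-09")]
print(group_by_iso_week(records))
```

Note that the ISO year can differ from the calendar year: 1 January 2022 falls in ISO week 52 of 2021, so strain A lands in a different group from B and C.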
Bug Fixes
filter: Fixed unintended behavior in which grouping by day would “work” when used with month and/or year. Updated so it will be ignored. #1070 (@victorlin)
filter: Fixed unintended behavior in which grouping by month with ambiguous years would “work”. Updated so date ambiguity is checked properly for all generated columns. #1072 (@victorlin)
18.0.0 (21 September 2022)
Major Changes
export: The --node-data option may now be given multiple times to provide additional .json files. Previously, subsequent occurrences of the option overrode prior occurrences. This is a breaking change, although we expect few usages to be impacted. Each occurrence of the option may still specify multiple files at a time. #1010 (@tsibley)
Bug Fixes
refine: 17.1.0 updated TreeTime to version 0.9.2 and introduced the refine flag --use-fft. This makes previously costly marginal date inference cheaper. This update adjusts when refine runs marginal date inference during its iterative optimization. Without the --use-fft flag, it will now behave as it did before 17.1.0 (marginal inference only during final iterations). With the --use-fft flag, marginal date inference will be used at every step during the iteration if refine is run with --date-inference marginal #1034. (@rneher)
tree: When using IQtree as tree builder, --nthreads now sets the maximum number of threads (IQtree argument -ntmax). The actual number of threads to use can be specified by the user through the tree-builder-arg -nt which defaults to -nt AUTO, causing IQtree to automatically choose the best number of threads to use #1042 (@corneliusroemer)
Make cvxopt a required dependency, since it is required for titer models to work #1035. (@victorlin)
filter: Fix compatibility with Pandas 1.5.0 which could cause an unexpected AttributeError with an invalid --query given to augur filter. #1050 (@tsibley)
refine: Add --verbosity argument that is passed down to TreeTime to facilitate monitoring and debugging. #1033 (@anna-parker)
Improve handling of errors from TreeTime. #1033 (@anna-parker)
17.1.0 (19 August 2022)
Features
refine: Upgrade TreeTime from 0.8.6 to >= 0.9.2 which enables a speedup of timetree inference in marginal mode due to the use of Fast Fourier Transforms #1018. (@rneher and @anna-parker). Use the refine flag --use-fft to use this feature.
Bug Fixes
refine, export v1: Use pandas.DataFrame.at instead of .loc for single values #979. (@victorlin)
refine: Gracefully handle all exceptions from TreeTime #1023. (@anna-parker)
refine: Document branch length units treetime expects #1024. (@anna-parker)
dates: Raise an error when metadata to get_numerical_dates() is not a pandas DataFrame #1026. (@victorlin)
17.0.0 (9 August 2022)
Major Changes
Moved the following modules to subpackages #1002. (@joverlee521) These are technically breaking changes for the API, but they do not change the Augur CLI commands.
import.py -> import_/__init__.py
import_beast.py -> import_/beast.py
measurements.py -> measurements/__init__.py + measurements/concat.py + measurements/export.py
Move the following internal functions/classes #1002. (@joverlee521)
augur.add_default_command -> argparse_.add_default_command
utils.HideAsFalseAction -> argparse_.HideAsFalseAction
Subcommands must include a register_parser function to add their own parser instead of a register_arguments function #1002. (@joverlee521)
utils: Remove internal function utils.read_metadata() #978. (@victorlin)
Use io.read_metadata() going forwards.
To switch to using metadata as a pandas DataFrame (recommended):
Iterate through strains: metadata.items() -> metadata.iterrows()
Check strain presence: strain in metadata -> strain in metadata.index
Check field presence: field in metadata[strain] -> field in metadata.columns
Get metadata for a strain: metadata[strain] -> metadata.loc[strain]
Get field for a strain: metadata[strain][field] -> metadata.at[strain, field]
To keep using metadata in a dictionary:
metadata = read_metadata(args.metadata)
metadata.insert(0, "strain", metadata.index.values)
columns = metadata.columns
metadata = metadata.to_dict(orient="index")
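The dictionary-to-DataFrame substitutions listed above can be tried on a toy table. The sketch uses pandas directly, with a hand-built DataFrame indexed by strain standing in for the return value of io.read_metadata(). The strain names and columns are invented for illustration.

```python
import pandas as pd

# Toy stand-in for the DataFrame returned by io.read_metadata(): one row per strain,
# with the strain name as the index. The values are invented.
metadata = pd.DataFrame(
    {"date": ["2020-01-01", "2020-02-15"], "country": ["USA", "France"]},
    index=pd.Index(["strainA", "strainB"], name="strain"),
)

# Iterate through strains (previously metadata.items()).
strains = [strain for strain, row in metadata.iterrows()]

# Check strain presence and field presence.
has_strain = "strainA" in metadata.index
has_field = "country" in metadata.columns

# Get all metadata for one strain, or a single field for that strain.
record = metadata.loc["strainB"]
country = metadata.at["strainB", "country"]

print(strains, has_strain, has_field, country)
```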
Features
Bug Fixes
filter: Rename internal force inclusion filtering functions #1006 (@victorlin)
16.0.3 (6 July 2022)
Bug Fixes
filter: Move register_arguments to the top of the module for better readability #995. (@victorlin)
filter: Fix a regression introduced in 16.0.2 that caused grouping with subsampled max sequences and force-included strains to fail in a data-specific way #1000. (@huddlej)
16.0.2 (30 June 2022)
Bug Fixes
The entropy panel was unavailable if mutations were not translated #881. This has been fixed by creating an additional annotations block in augur ancestral containing (nucleotide) genome annotations in the node-data #961 (@jameshadfield)
ancestral: WARNINGs to stdout have been updated to print to stderr #961 (@jameshadfield)
filter: Explicitly drop date/year/month columns from metadata during grouping. #967 (@victorlin)
This fixes a bug #871 where augur filter would crash with a cryptic ValueError if year and/or month is a custom column in the input metadata and also included in --group-by.
filter: Fix duplicates that may appear in metadata when using --include / --include-where with subsampling #986 (@victorlin)
16.0.1 (21 June 2022)
Bug Fixes
16.0.0 (16 June 2022)
Major Changes
filter: Error when any group-by column is not found #933 (@victorlin)
Check your workflows for any new errors that may arise from this.
parse: Error on duplicates instead of silently passing #918 (@victorlin)
Check your workflows for any new errors that may arise from this.
utils: Remove utils.myopen() #926 (@victorlin)
Use io.open_file() going forwards.
Moved the following internal functions #929, #923 (@victorlin):
utils.read_vcf -> io.read_vcf
utils.run_shell_command -> io.run_shell_command
utils.shquote -> io.shquote
utils.ambiguous_date_to_date_range -> dates.ambiguous_date_to_date_range
utils.is_date_ambiguous -> dates.is_date_ambiguous
utils.get_numerical_date_from_value -> dates.get_numerical_date_from_value
utils.get_numerical_dates -> dates.get_numerical_dates
Drop support for dict type as the first parameter #934
filter.write_vcf -> io.write_vcf
Features
Add the measurements subcommand with two sub-subcommands, export and concat #879 (@joverlee521)
filter: Report min and max date separately #930 (@victorlin)
export v2: Allow the color scale type to be temporal #969 (@jameshadfield)
Handle FileNotFoundError and unexpected exceptions gracefully #914 (@victorlin)
Bug Fixes
filter: Properly handle error on duplicates #918 (@victorlin)
filter: Reorganize Cram test files #943 (@victorlin)
filter: Reword comment on vcftools #924 (@victorlin)
io: Split io.py into smaller files under new io/ #949 (@victorlin)
io: Add tests for io.open_file() #926 (@victorlin)
Move AugurError to new errors.py, replace RuntimeError #921 (@victorlin)
Remove internal usage of utils.read_metadata() #934, #972 (@victorlin)
schemas: Add missing display_default properties for Auspice config v2 #916 (@tsibley)
CI: Split codecov into separate job, combine coverage of all matrix jobs #968 (@tsibley)
CI: Temporarily disable failing test #962 (@victorlin)
CI: pip install without editable mode #956 (@victorlin)
CI: Include functional tests in code coverage #899 (@huddlej)
CI: Move --quiet flag to accommodate snakemake=7.7.0 behavior #927 (@victorlin)
CI: Move docker rebuild step to release workflow #912 (@victorlin)
Update release process #913 (@victorlin)
15.0.2 (5 May 2022)
Bug Fixes
docs: Fix API documentation rendering and add page for io module #896 (@joverlee521)
CI: Use GitHub Actions for release process #904 (@victorlin)
utils: Fix branch length annotations in json_to_tree function #908 (@huddlej)
export v2: Use io.read_metadata during export, fixing a bug caused when the user’s input metadata does not have any valid strain id columns #909 (@huddlej)
CI: Call new GitHub Actions workflow to rebuild images #910 (@victorlin)
15.0.1 (25 April 2022)
Bug Fixes
15.0.0 (15 April 2022)
Major Changes
export: Move extensions block to meta #888 (@corneliusroemer)
Note: this is technically a breaking change, but the misplaced extensions block was added in version 14.1.0 and intended for internal use by Nextclade. We don’t expect any users to be impacted by this.
Features
Bug Fixes
14.1.0 (31 March 2022)
Features
schemas: Extend export v2 schema to support an array of trees #851 (@tsibley)
schemas: Add JSON schemas for our root-sequence and tip-frequencies sidecars #852 (@tsibley)
schemas: Add JSON schema for measurements sidecar #859 (@joverlee521)
filter: Send warnings to stderr to be consistent with other warnings #862 (@victorlin)
export: Allow an extensions block in auspice config & dataset JSONs #865 (@jameshadfield)
export: Allow skipping of input/output schema validation #865 (@jameshadfield)
export: Order keys in dataset for easier reading #868 (@jameshadfield)
Bug Fixes
parse: Fix typo in internal variable name #850 (@emmahodcroft)
14.0.0 (8 February 2022)
Major Changes
Drop support for Python 3.6, add support for 3.9 and 3.10 #822 (@victorlin)
Features
refine: Enable bootstrap support by passing confidence values through to Auspice JSONs from augur refine node data JSONs #839 (@huddlej)
tree: Allow users to override default tree builder arguments with a new --override-default-args flag #839 (@huddlej)
clades: Allow descendant clades to be defined by explicitly inheriting from ancestral clade names #846 (@corneliusroemer)
Bug Fixes
tree: Fix segmentation fault that can occur when user-provided tree builder args conflict with IQ-TREE’s hardcoded defaults. The new --override-default-args flag allows users to override the defaults that conflict with their values. #839 (@huddlej)
filter/utils: Fix year-only and numeric date handling #841 (@victorlin)
CI: test earliest supported Biopython versions in matrix, remove redundant installs #843 (@victorlin)
13.1.2 (28 January 2022)
Features
Bug Fixes
13.1.1 (21 January 2022)
Bug Fixes
13.1.0 (10 December 2021)
Features
schemas: Add “$id” key to Auspice config schemas so we have a way of referring to these. #806 (@tsibley)
Bug Fixes
filter: Fix groupby with incomplete dates. #808 (@victorlin)
13.0.4 (8 December 2021)
Bug Fixes
dependencies: Replace deprecated mutable sequence interface for BioPython. #788 (@Carlosbogo)
dependencies: Fix backward compatibility with BioPython. #801 (@huddlej)
data: Add latitude and longitude details for “Reunion”. #791 (@corneliusroemer)
filter: Use pandas functions to determine subsample groups. #794 and #797 (@victorlin)
filter: Add clarity to help message and output of probabilistic sampling. #792 (@victorlin)
13.0.3 (19 November 2021)
Bug Fixes
13.0.2 (12 October 2021)
Bug Fixes
13.0.1 (1 October 2021)
Bug Fixes
13.0.0 (17 August 2021)
Major Changes
filter: Skip metadata records with ambiguous month information in the date column when grouping by month instead of randomly generating month values for those records. This change alters the behavior of the filter command for metadata with ambiguous month values. For these data, consider using --group-by year instead of --group-by year month. #761 (@huddlej)
Features
filter: When grouping by year or month, report the number of strains skipped due to ambiguous year and month both in the summary report at the end of filtering and in the --output-log contents #761 (@huddlej)
12.1.1 (13 August 2021)
Bug Fixes
12.1.0 (12 August 2021)
Features
export: Add support for custom legend and color scale specifications in Auspice config files #727 (@jameshadfield)
utils: Add support for compressed strain name files (e.g., “include.txt.gz”) #730 (@benjaminotter)
filter: Rewrite internal logic to use pandas DataFrames (#743), define filters and subsampling logic as individual functions (#745 and #746), and iterate through chunks of metadata instead of loading all records into memory at once (#750) (@tsibley, @huddlej)
Bug Fixes
12.0.0 (13 April 2021)
Major Changes
filter: Date bounds (--min-date and --max-date) are now inclusive instead of exclusive such that records matching the given dates will pass date filters #708 (@benjaminotter)
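The difference between inclusive and exclusive bounds only shows up for records dated exactly on a bound. The sketch below shows the inclusive comparison in plain Python. passes_date_filter is a hypothetical function written for illustration and is not part of Augur.

```python
from datetime import date


def passes_date_filter(sample_date, min_date=None, max_date=None):
    """Return True when sample_date lies within the inclusive [min_date, max_date] range."""
    if min_date is not None and sample_date < min_date:
        return False
    if max_date is not None and sample_date > max_date:
        return False
    return True


bound = date(2021, 4, 13)
# A record dated exactly on the bound passes under inclusive semantics.
print(passes_date_filter(bound, min_date=bound))
print(passes_date_filter(bound, max_date=bound))
```

Under the earlier exclusive behavior, both of these records would have been filtered out.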
Bug Fixes
refine: Recommend an alternate action when skyline optimization fails #712 (@huddlej)
Features
distance: Count insertion/deletion events once in pairwise distances #698 (@huddlej, @benjaminotter)
distance: Optionally ignore specific list of characters defined in a distance map’s top-level ignored_characters list #707 (@benjaminotter)
filter: Allow --subsample-max-sequences without --group-by #710 (@benjaminotter)
tree: Prefer iqtree2 binary over iqtree when possible #711 (@benjaminotter)
11.3.0 (19 March 2021)
Bug Fixes
Features
io: Add new io module with open_file, read_sequences, and write_sequences functions that support compressed inputs and outputs #652
parse, index, filter, mask: Add support for compressed inputs/outputs #652
export v2: Add optional data_provenance field to auspice JSON output for better provenance reporting in Auspice #705
11.2.0 (8 March 2021)
Bug Fixes
Documentation
Features
filter: Enable filtering by metadata only such that sequence inputs/outputs are optional and metadata/strain list outputs are now possible #679
filter: Enable extraction of sequences from multiple lists of strains with a new --exclude-all flag and support for multiple inputs to the --include argument #679
11.1.2 (16 February 2021)
Bug Fixes
index: Remove call to deprecated BioPython SeqIO.close method #684
11.1.1 (16 February 2021)
Bug Fixes
11.1.0 (12 February 2021)
Bug Fixes
Features
11.0.0 (22 January 2021)
Major Changes
filter: Use probabilistic sampling by default when requesting a maximum number of sequences to subsample with --subsample-max-sequences. Adds --no-probabilistic-sampling flag to disable this default behavior and prevent users from requesting fewer maximum sequences than there are subsampling groups. #659
10.3.0 (14 January 2021)
Bug Fixes
Features
10.2.0 (1 January 2021)
Features
filter: Add --probabilistic-sampling flag to allow subsampling with --subsample-max-sequences when the number of groups exceeds the requested number of samples #629
scripts: Add script to identify emerging clades from existing Nextstrain build JSONs #653
docs: Add instructions to update conda installations prior to installing Augur #655
10.1.1 (16 November 2020)
Bug Fixes
dependencies: Require the most recent minor versions of TreeTime (0.8.X) to fix numpy matrix errors #633
10.1.0 (13 November 2020)
Features
10.0.4 (6 November 2020)
Bug Fixes
10.0.3 (23 October 2020)
Bug Fixes
10.0.2 (8 September 2020)
Bug Fixes
10.0.1 (8 September 2020)
Bug Fixes
ancestral: Clarify default values for inference of ambiguous bases #613
10.0.0 (17 August 2020)
Major Changes
Remove Snakemake as a dependency of the augur Python package #557
utils: read_colors refactor #588
raises an exception when the requested color file is missing instead of printing a warning to stdout
splits out logic to parse colors file into separate classes (util_support/color_parser.py and util_support/color_parser_line.py) with unit tests
utils: read_metadata interface improvements
utils: read_node_data interface improvements #595, #605
exits with a nonzero code when node data node names don’t match tree nodes and when the input tree cannot be loaded
refactors logic to read node data into separate classes with unit tests
Bug Fixes
ancestral: Fix docstring for collect_mutations_and_sequences 4c474a9
parse: Fix date parsing bug caused by a change in the API for parse_time_string in pandas 1.1.0 #601
refine: Enable divergence unit scaling without timetree e9b3eec
tree: Use IQ-TREE’s -nt AUTO mode when users request more threads than there are input sequences, avoiding an IQ-TREE error #598
Features
9.0.0 (29 June 2020)
Major Changes
align: The API to the read_sequences function now returns a list of sequences instead of a dictionary #536
Bug Fixes
align: Prevent duplicate strains warning when using --reference-name #536
docs: Sync and deduplicate installation documentation from README to main docs #578
export: Flexibly disambiguate multiple publications by the same author #581
frequencies: Avoid interpolation of a single data point during frequency estimation with sparse data #569
parse: Actually remove commas during prettify when this behavior is requested #573
tests: Always use the local helper script (bin/augur) to run tests instead of any globally installed augur executables #527
tree: Keep log files after trees are built #572
utils: Do not attempt to parse dates with only ambiguous months (e.g., 2020-XX-01) #532
utils: Parse name column of metadata as a data field instead of a pandas DataFrame attribute #564
Features
docs: Updates description of how missing data are handled by augur traits
filter: Add support for ISO 8601 dates (YYYY-MM-DD) for --min-date and --max-date #568
tests: Add tests for utilities (ambiguous date parsing #532 and run_shell_command #577), parse #573, and translate #546
tree: Allow VCF input without an --exclude-sites argument #565
8.0.0 (8 June 2020)
Major Changes
utils: Add a consolidated generic load_mask_sites function and specific read_mask_file and read_bed_file functions for reading masking sites from files. Changes the Python API by moving mask-loading functionality out of augur mask and tree into utils #514 and #550
mask: Parse BED files as zero-indexed, half-open intervals #512
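To illustrate what zero-indexed, half-open means for a BED interval, here is a minimal sketch that expands one BED line into the site indexes it covers. The function names are hypothetical and this is not Augur's read_bed_file implementation.

```python
def parse_bed_line(line):
    """Parse one tab-delimited BED line into (chrom, start, end)."""
    chrom, start, end = line.rstrip("\n").split("\t")[:3]
    return chrom, int(start), int(end)


def bed_interval_to_sites(start, end):
    """Expand a BED interval [start, end) into zero-based site indexes."""
    # The start is included and the end is excluded, so an interval
    # of 0..3 covers the first three sites of the sequence.
    return list(range(start, end))


chrom, start, end = parse_bed_line("ref\t0\t3\n")
sites = bed_interval_to_sites(start, end)
print(sites)  # [0, 1, 2]
```

Under this convention the interval length is simply end minus start, and an interval whose start equals its end covers no sites.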
Bug Fixes
Features
align: Report insertions stripped during alignment #449
Require minimum pandas version of 1.0.0 #488
parse: Reduce memory use and clarify code with standard Python idioms #496
mask: Allow masking of specific sites passed by the user with --mask-sites and masking of a fixed number of sites from the beginning or end of each sequence with --mask-from-beginning and --mask-from-end #512
clades, import: Use defaultdict to simplify code #533
tests: Add initial functional tests of the augur command line interface using Cram #542
refine: Add a --seed argument to set the random seed for more reproducible outputs across runs #542
ancestral, refine, and traits: Print the version of TreeTime being used for these commands #552
filter: Add support for flexible pandas-style queries with new --query argument #555
export: Allow display defaults for transmission lines #561
7.0.2 (7 April 2020)
Bug Fixes
filter: Fix regression introduced in 7.0.0 which caused an error to be raised if a priorities file didn’t include every sequence. Sequences which are not explicitly listed will once again default to a priority of 0. #530
7.0.1 (7 April 2020)
Bug Fixes
Fix typo with Python classifiers in setup.py
7.0.0 (7 April 2020)
Major Changes
Features
improve testing by
align: reverse complement sequences when necessary using mafft’s autodirection flag #467
align: speed up replacement of gaps with “ambiguous” bases #474
mask: add support for FASTA input files #493
traits: bump TreeTime version to 0.7.4 and increase maximum number of unique traits allowed from 180 to 300 #495
Bug Fixes
align: enable filling gaps in input sequences even if no reference is provided instead of throwing an exception #466
align: detect duplicate sequences by comparing sequence objects instead of (often truncated) string representations of those objects #468
import_beast: use raw strings for regular expressions to avoid syntax errors in future versions of Python #469
scripts: update exception syntax to new style #484
filter: fail loudly when a given priority file is invalid and exit instead of just printing an error #487
Documentation
6.4.3 (25 March 2020)
Bug Fixes
align: Remove reference sequence from alignments even when no gaps exist in input sequences relative to the reference. Thank you @danielsoneg! #456
Documentation
Reorganize README, improve findability of documentation, and add separate dev docs. #461
6.4.2 (17 March 2020)
Bug Fixes
Require Snakemake less than 5.11 to avoid a breaking change. The --cores argument is now required by 5.11, which will affect many existing augur-based workflows. Reported upstream as snakemake/snakemake#283.
align: Run mafft with the --nomemsave option. This makes alignments of sequences over 10k in length run much, much faster in the general case and shouldn’t cause issues for most modern hardware. We may end up needing to add an off-switch for this mode if it causes issues for other users of augur, but the hope is that it will make things just magically run faster for most folks! There is likely more tuning that could be done with mafft, but this is a huge improvement in our testing. #458
align: Ignore blank lines in --include files. Thanks @CameronDevine! #451
align: Properly quote filenames when invoking mafft. Thanks @CameronDevine! #452
6.4.1 (4 March 2020)
Bug Fixes
export: AA labels are now exported for branches where a clade is also labeled. See PR 447
export / validation: a dataset title is no longer required
release script now works on MacOS & code-signing is optional. See PR 448
traits: Missing data is correctly handled
6.4.0 (26 February 2020)
Features
align: New sequences can now be added to an existing alignment. #422
align: Multiple sequence files can be provided as input. #422
align: Extra debugging files such as *.pre_aligner.fasta and *.post_aligner.fasta are no longer produced by default. To request them, pass the --debug flag. #422
align: De-duplicate input sequences, with a warning. #422
export v2: Add support for the branch_label property in display_defaults, which was recently added to Auspice. #445
Bug fixes
align: Exits with an error earlier if arguments are invalid instead of only printing a warning. #422
align: Performs more error checking and clarifies the help and error messages. #422
export v2: Traits which are filters but not colorings are now exported as well, instead of being left out. #442
export v2: Exits non-zero when validation fails, instead of masking errors. #441
validate: In order to improve clarity, messages now include the filenames involved and distinguish between schema validation and internal consistency checks. #441
6.3.0 (13 February 2020)
Features
Augur refine, ancestral and traits now use the upgraded TreeTime v0.7. This should have a number of under-the-hood improvements. See PR 431
ancestral: New options to either --keep-ambiguous or --infer-ambiguous. If using --infer-ambiguous the previous behavior will be maintained in which tips with N will have their nucleotide state inferred. If using --keep-ambiguous, these tips will be left as N. With this upgrade, we are still defaulting to --infer-ambiguous, however, we plan to swap default to --keep-ambiguous in the future. If this distinction matters to you, we would suggest that you explicitly record --keep-ambiguous / --infer-ambiguous in your build process. Also part of PR 431
traits: Allow input of --weights which references a .tsv file in the following format:
division	Hubei	10.0
division	Jiangxi	1.0
division	Chongqing	1.0
where these weights represent equilibrium frequencies in the CTMC transition model. We imagine the primary use of user-specified weights is to correct for strong sampling biases in available data. See PR 443
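As a rough illustration of the weights file layout described for traits, the following sketch reads three tab-separated columns (trait column, trait value, weight) into a nested dictionary. The function name and the resulting data structure are assumptions for illustration, not Augur's own parser.

```python
import csv
import io


def read_weights(handle):
    """Read (column, value, weight) rows into {column: {value: weight}}.

    Hypothetical helper; assumes exactly three tab-separated fields per row.
    """
    weights = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        column, value, weight = row
        weights.setdefault(column, {})[value] = float(weight)
    return weights


example = io.StringIO(
    "division\tHubei\t10.0\n"
    "division\tJiangxi\t1.0\n"
    "division\tChongqing\t1.0\n"
)
weights = read_weights(example)
print(weights["division"]["Hubei"])  # 10.0
```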
Bug fixes
Improvements to make shell scripts run more easily on Windows. See PR 437
6.2.0 (25 January 2020)
Features
refine: Include --divergence-units option to distinguish between mutations and mutations-per-site. Keep mutations-per-site as default behavior. See PR 435
Bug fixes
utils: Support v2 auspice JSONs in json_to_tree utility function. See PR 432
6.1.1 (17 December 2019)
Bug fixes
frequencies: Fix bug in string matching for weighted frequencies introduced in v6.1.0. See PR 426.
6.1.0 (13 December 2019)
Features
export: Include --description option to pass in a Markdown file with dataset description. This is displayed in Auspice in the footer. For rationale, see Auspice issue 707 and for Augur changes see PR 423.
Bug fixes
frequencies: Fix weighted frequencies when weight keys are unrepresented. See PR 420.
6.0.0 (10 December 2019)
Overview
Version 6 is a major release of augur affecting many augur commands. The format
of the exported JSON (v2) has changed and now merges the previously separate
files containing tree and meta information. To maintain backward compatibility,
the export command was split into export v1 (old) and export v2 (new).
Detailed release notes are provided in the augur documentation on
read-the-docs.
For a migration guide, consult
migrating-v5-v6.
Major features / changes
export: Swap from a separate _tree.json and _meta.json to a single “unified” dataset.json output file
export: Include additional command line options to alleviate need for Auspice config
export: Include option for reference sequence output
export: Move to GFF-style annotations
export: Validate exported JSONs against schema
ancestral: Allow output of FASTA and JSON files
import: Include import beast command to import labeled BEAST MCC tree
parse: Include --prettify-fields option to clean up metadata fields
Documentation improvements
Minor features / changes
colors.tsv: Allow whitespace, but insist on tab delimiting
lat_longs.tsv: Allow whitespace, but insist on tab delimiting
Remove code for old “non-modular” augur, old “non-modular” builds and Python tests
Improve test builds
filter: More interpretable output of how many sequences have been filtered
filter: Additional flag --subsample-seed to seed the random number generator and thereby make subsampling reproducible
sequence-traits: Numerical output as originally intended, but required an Auspice bugfix
traits: Explanation of what is considered missing data & how it is interpreted
traits: GTR models are exported in the output JSON for better accountability & reproducibility
5.4.1 (12 November 2019)
Bug fixes
export v1: Include --minify-json option that was mistakenly not included in PR 398. See PR 409
5.4.0 (7 November 2019)
Features
frequencies: Include --minimal-clade-size-to-estimate command line option. See PR 383
lbi: Include --no-normalization command line option. See PR 380
Compatibility fixes
export: Include v1 subcommand to allow forwards compatibility with Augur v6 builds. See PR 398
Bug fixes
export: Include warning if using a mismatched v6 translate file. See PR 392
frequencies: Fix determination of interval for clipping of non-informative pivots
5.3.0 (9 September 2019)
Features
export: Improve printing of error messages with missing or conflicting author data. See issue 274
filter: Improve printing of dropped strains to include reasons why strains were dropped. See PR 367
refine: Add support for command line flag --keep-polytomies to not resolve polytomies when producing a time tree. See PR 345
Bug fixes
Catch and throw error when there are duplicate strain names. See PR 356
Fix missing annotation of “parent” attribute for the root node
Run shell commands with more robust error checking. See PR 350
Better handling of rerooting options for trees without temporal information. See issue 348
Data
Small fixes in geographic coordinate file
5.2.1 (4 August 2019)
Bug fixes
Print more useful error message if Python recursion limit is reached. See issue 328
Print more useful error message if vcftools is missing. See PR 312
Development
Significantly relax version requirements specified in setup.py for biopython, pandas, etc… Additionally, move lesser used packages (cvxopt, matplotlib, seaborn) into an “extras_require” field. This should reduce conflicts with other pip installed packages. See PR 323
Data
Include additional country lat/longs in base data
5.2.0 (23 July 2019)
Features
ancestral: Adds a new flag --output-sequences and logic to support saving ancestral sequences and leaves from the given tree to a FASTA file. Also adds a redundant, more specific flag --output-node-data that will replace the current --output flag in the next major version release of augur. For now, we issue a deprecation warning when the --output flag is used. Note that FASTA output is only allowed for FASTA inputs and not for VCFs. We don’t allow FASTA output for VCFs anywhere else and, if we did here, the output files would be very large. See PR 293
frequencies: Allow --method kde flag to compute frequencies via KDE kernels. This complements existing method of --method diffusion. Generally, KDE frequencies should be more robust and faster to run, but will not project as well when forecasting frequencies into the future. See PR 271
Bug fixes
ancestral, traits, translate: Print warning if supplied tree is missing internal node names (normally provided by running augur refine). See PR 283
Include pip in Conda environment file. See PR 309
Documentation
Document environment variables respected by Augur
Development
Remove matplotlib and seaborn from setup.py install. These are still called in a few places in augur (like titers.validate()), but this was deemed rare enough that removing them from setup.py would ease general install for most users. Additionally, the ipdb debugger has been moved to dev dependencies. See PR 291
Refactor logic to read trees from multiple formats into a function. Adds a new function read_tree to the utils module that tries to safely handle reading trees in multiple input formats. See PR 310
5.1.1 (1 July 2019)
Features
tree: Add support for the GTR+R10 substitution model.
tree: Support parentheses in node names when using IQ-TREE.
Bug fixes
Use the center of the UK for its coordinates instead of London.
filter: Mark --output required, which it always was but wasn’t marked.
filter: Avoid error when no excluded strains file is provided.
export: Fix for preliminary version 2 schema support.
refine: Correct error handling when the tree file is missing or empty.
Documentation
Add examples of Augur usage in the wild.
Rename and reorganize CLI and Python API pages a little bit to make “where do I start learning to use Augur?” clearer to non-devs.
Development
Relax version requirements of pandas and seaborn. The hope is this will make installation smoother (particularly alongside other packages which require newer pandas versions) while not encountering breaking changes in newer versions ourselves.
5.1.0 (29 May 2019)
Documentation
Documentation is now available online for the augur CLI and Python API via Read The Docs: https://nextstrain-augur.readthedocs.io. The latest version on RTD points to the git master branch, and the stable version to the most recent tagged release. Instructions for building the docs locally are in the README.
5.0.0 (26 May 2019)
Features
ancestral: New option to --keep-ambiguous, which will not infer nucleotides at ambiguous (N) sites on tip sequences and instead leave as ‘N’. See PR 280.
ancestral: New option to --keep-overhangs, which will not infer nucleotides for gaps on either side of the alignment and instead leave as ‘-’. See PR 286.
clades: This module has been reconfigured to identify clade defining mutations on top of a reference rather than identifying mutations along the tree. The command line arguments are the same except for the addition of --reference, which explicitly passes in a reference sequence. If --reference is not defined, then reference will be drawn from the root node of the phylogeny by looking for sequence attribute attached to root node of --tree. See PR 288.
refine: Revise rooting behavior. Previously --root took ‘best’, ‘residual’, ‘rsq’ and ‘min_dev’ as options. In this update --root takes ‘best’, ‘least-squares’, ‘min_dev’ and ‘oldest’ as rooting options. This eliminates ‘residual’ and ‘rsq’ as options. This is a backwards-incompatible change. This requires updating TreeTime to version 0.5.4 or above. See PR 263.
refine: Add --keep-root option that overrides --root specification to preserve tree rooting. See PR 263.
refine: Add --covariance and --no-covariance options that specify TreeTime behavior. See PR 263.
titers: This command now throws an InsufficientDataException if there are not sufficient titers to infer a model. This is paired with a new --allow-empty-model flag that proceeds past the InsufficientDataException and writes out a model JSON corresponding to an ‘empty’ model. See PR 281.
By default JSONs are written with indent=1 to give a pretty-printed JSON. However, this adds significant file size to large tree JSONs. If the environment variable AUGUR_MINIFY_JSON is set then minified JSONs are printed instead. This mirrors the explicit --minify-json argument available to augur export. See PR 278.
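The trade-off between pretty-printed and minified JSON can be sketched with the standard library alone. The helper below is hypothetical: it switches between one-space indentation and compact output based on whether AUGUR_MINIFY_JSON is set, and Augur's own writer may interpret the variable's value differently.

```python
import json
import os


def dumps_for_export(data):
    """Serialize data, minifying when AUGUR_MINIFY_JSON is set.

    Hypothetical helper for illustration; not Augur's implementation.
    """
    if os.environ.get("AUGUR_MINIFY_JSON"):
        # No indentation and no spaces after separators.
        return json.dumps(data, indent=None, separators=(",", ":"))
    return json.dumps(data, indent=1)


os.environ.pop("AUGUR_MINIFY_JSON", None)
pretty = dumps_for_export({"name": "root"})

os.environ["AUGUR_MINIFY_JSON"] = "1"
minified = dumps_for_export({"name": "root"})
os.environ.pop("AUGUR_MINIFY_JSON", None)

print(len(pretty) > len(minified))  # True
```

The size difference grows with nesting depth, which is why it matters most for large tree JSONs.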
Bug fixes
export: Cast numeric values to strings for export. See issue 287.
export: Legend order preserves ordering passed in by user for traits that have default colorings (‘country’ and ‘region’). See PR 284.
refine: Previously, the --root argument was silently ignored when no timetree was inferred. Re-rooting with an outgroup is sensible even without a timetree. See PR 282.
4.0.0 (24 April 2019)
Features
distance: New interface for specifying distances between sequences. This is a backwards-incompatible change. Refer to augur distance --help for all the details.
export: Add a --minify-json flag to omit indentation in Auspice JSONs.
Bug fixes
frequencies: Emit one-based coordinates (instead of zero-based) for KDE-based mutation frequencies
Data
Include additional country lat/longs in base data
3.1.8 (13 February 2019)
Bug fixes
titers: fix calculation of mean_potentency for model export
3.1.7 (5 February 2019)
Bug fixes
Update to TreeTime 0.5.3
tree: Fix bug in printing causing errors in Python versions <3.6
tree: Alter site masking to not be so memory intensive
3.1.6 (29 January 2019)
Features
filter: Allow negative matches to --exclude-where. For example, --exclude-where country!=usa would exclude all samples where metadata country does not equal usa.
tree: Allow --exclude-sites to work with FASTA input. Ensure that indexing of input sites is one-based.
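The negative-match behavior of --exclude-where can be illustrated with a small sketch that parses a condition such as country!=usa and applies it to metadata records. The function names are hypothetical, and the case-insensitive value comparison is an assumption made for this example rather than a statement about Augur's behavior in this release.

```python
def parse_condition(condition):
    """Split 'column=value' or 'column!=value' into (column, negate, value)."""
    if "!=" in condition:
        column, value = condition.split("!=", 1)
        return column, True, value
    column, value = condition.split("=", 1)
    return column, False, value


def is_excluded(record, condition):
    """Return True when a metadata record should be excluded."""
    column, negate, value = parse_condition(condition)
    # Assumption: compare values without regard to case.
    matches = record.get(column, "").lower() == value.lower()
    return not matches if negate else matches


metadata = [
    {"strain": "A", "country": "usa"},
    {"strain": "B", "country": "canada"},
]
kept = [r["strain"] for r in metadata if not is_excluded(r, "country!=usa")]
print(kept)  # ['A']
```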
Bug fixes
Fix loading of strains when loading titers from file. Previously, strains had not been filtered to match the tree appropriately.
3.1.5 (13 January 2019)
Features
frequencies: Add --ignore-char and --minimal-clade-size as options.
frequencies: Include --stiffness and --inertia as options.
titers: Allow multiple titer date files in --titers import.
Bug fixes
filter: Fix --non-nucleotide call to include ? as allowed character.
tree: Fix --method raxml to properly delimit interim RAxML output so that simultaneous builds don’t conflict.
Data
Include additional country lat/longs in base data
3.1.4 (1 January 2019)
Bug fixes
frequencies: Include counts in augur frequencies output JSON to support downstream plotting.
Data
Include additional country lat/longs in base data
3.1.3 (29 December 2018)
Features
filter: Add --non-nucleotide option to remove sequences with non-conforming nucleotide characters.
Bug fixes
Revise treatment of - and space characters in augur parse to leave - as is and remove white space. Also delimit [ and ] to _.
Fix bug in naming of temp IQTREE files to prevent conflicts from simultaneous builds.
Data
Include additional country lat/longs in base data
Development
Remove non-modular measles build in favor of nextstrain/measles repo.
3.1.2 (21 December 2018)
Bug fixes
Update dependencies
3.1.1 (21 December 2018)
Bug fixes
filter: Fix --include-where. Adds an all_seq variable needed by the logic to include records by value. This was previously working for VCF but threw an exception for sequences in FASTA format.
Update flu reference viruses and lat longs.
Update dependencies
3.1.0 (18 December 2018)
Features
reconstruct-sequences: Include augur reconstruct-sequences module that reconstructs alignments from mutations inferred on the tree
distance: Include augur distance module that calculates the distance between amino acid sequences across entire genes or at a predefined subset of sites
lbi: Include augur lbi module that calculates local branching index (LBI) for a given tree and one or more sets of parameters.
frequencies: Include --method kde as option to augur frequencies, separate from the existing --method diffusion logic. KDE frequencies are faster and better for smaller clades but don’t extrapolate as well as diffusion frequencies.
titers: Enable annotation of nodes in a tree from the substitution model
3.0.5.dev1 (26 November 2018)
Bug fixes
translate: Nucleotide (“nuc”) annotation for non-bacterial builds starts at 0 again, not 1, fixing a regression.
Documentation
Schemas: Correct coordinate system description for genome start/end annotations.
3.0.4.dev1 (26 November 2018)
Bug fixes
validate: Fix regression for gene names containing an asterisk.
Development
Fix Travis CI tests which were silently not running.
3.0.3.dev1 (26 November 2018)
Features
refine: Add a --clock-std-dev option
traits: Add a --sampling-bias-correction option for mugration model
validate: Gene names in tree annotations may now contain hyphens. Compatible with Auspice version 1.33.0 and later.
All JSON is now emitted with sorted keys, making it easier to diff and run other textual comparisons against output.
Bug fixes
filter: Only consider A, T, C, and G when calculating sequence length for the --min-length option.
filter: Allow comments in files passed to --exclude.
filter: Ignore case when matching trait values against excluded values.
Normalize custom geographic names to lower case for consistent matching.
Data
Fix typo in geographic entry for netherlands.
Schemas: Reconcile naming patterns used in gene definitions and tree annotations.
Development
Upgrade TreeTime dependency to 0.5.x and at least 0.5.1.
Add an environment.yml file for use with conda env create.
Stop testing under Python 2.7 on Travis CI.
3.0.2.dev1 (27 September 2018)
Bug fixes
translate: Fix broken --help message
3.0.1.dev1 (27 September 2018)
Features
align and tree: The --nthreads option now accepts the special value “auto” to automatically set the number of threads to the number of CPU cores available.
Alias augur --version to augur version
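For the “auto” thread setting described above, a minimal sketch of resolving the option value might look like this. The function name and the fallback to one thread are assumptions for illustration, not Augur's implementation.

```python
import os


def resolve_nthreads(value):
    """Turn an --nthreads value ('auto' or a positive integer) into an int.

    Hypothetical helper for illustration only.
    """
    if value == "auto":
        # os.cpu_count() can return None; fall back to a single thread.
        return os.cpu_count() or 1
    nthreads = int(value)
    if nthreads < 1:
        raise ValueError("nthreads must be a positive integer or 'auto'")
    return nthreads


print(resolve_nthreads("4"))     # 4
print(resolve_nthreads("auto"))  # depends on the machine
```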
Bug fixes
tree: The --nthreads option is now respected. Previously all tree builders were ignoring the value and using either 2 threads (RAxML, IQ-TREE) or as many threads as cores (FastTree, if the OpenMP version).
translate: Check for and, if necessary, pad nucleotide sequences which aren’t a multiple of 3 earlier to avoid errors later.
export: Optionally write inferred nucleotide and amino acid sequences (or mutations) to a separate file.
export: Omit genes with no amino acid mutations.
validate: Allow underscores in gene names.
refine: Remove unused --nthreads argument.
ancestral, filter, tree, refine: Exit 1 instead of -1 on error.
Print the help message, instead of throwing an exception, when augur is run without arguments.
Documentation
Briefly describe each command in its --help output and in the global augur --help output.
Revamp README to emphasize new, modular augur and make it suitable for inclusion on PyPi.
Reconciled conflicting license declarations; augur is AGPLv3 (not MIT) licensed like the rest of Nextstrain.
Include URLs for bug reports, the change log, and the source on PyPi.
Data
Geographic coordinates added for the Netherlands and the Philippines.
Development
Reset the release branch when rewinding a failed local release process.
Refactor the augur program and command architecture for improved maintainability.
3.0.0.dev3 (4 September 2018)
Development
Use an allowed Topic classifier so we can upload to PyPi
Ignore distribution egg-info build files
3.0.0.dev2 (4 September 2018)
Features
Export: Add safety checks for optional annotations and geo data
Include more lat/longs in the default geo data
Development
Add release tooling
Document the release process and a few development practices
Travis CI: Switch to rebuilding the Docker image only for new releases
Remove ebola, lassa, tb, WNV, and zika builds now in their own repos. These builds are now available at URLs like https://github.com/nextstrain/ebola, for example.
3.0.0.dev1 (unreleased)
Development
Start versioning augur beginning with 3.0.0. A new augur version command reports the running version.