wolfpit.net Git - experiments/OpenNMT-py/.git/log
Vincent Nguyen [Thu, 31 Jan 2019 08:52:35 +0000 (09:52 +0100)]
Bump minor version (#1243)
flauted [Thu, 31 Jan 2019 08:19:26 +0000 (03:19 -0500)]
Add unit tests for embeddings module. (#1241)
Guillaume Klein [Wed, 30 Jan 2019 15:30:15 +0000 (16:30 +0100)]
Fix sequence length filtering (#1239)
Guillaume Klein [Wed, 30 Jan 2019 12:53:17 +0000 (13:53 +0100)]
Read dataset files in bytes, decode to utf-8 later (#1238)
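A minimal sketch of the idea in #1238 (function names are hypothetical, not the repo's actual API): read the file in binary mode and defer UTF-8 decoding until a line is actually consumed.

```python
def read_lines_bytes(path):
    # Read in binary mode; no decoding happens here.
    with open(path, "rb") as f:
        for raw in f:
            yield raw

def decode_line(raw):
    # Decode only when the line is actually needed.
    return raw.decode("utf-8")
```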
Guillaume Klein [Wed, 30 Jan 2019 12:52:50 +0000 (13:52 +0100)]
Fix incorrect self attention type checking (#1237)
self_attn is an attribute of `TransformerDecoderLayer`, not
`TransformerDecoder`.
Guillaume Klein [Tue, 29 Jan 2019 17:45:03 +0000 (18:45 +0100)]
Option to pad the vocabulary size to a multiple (#1235)
This is required to enable some optimized code paths (e.g. Tensor Cores on
NVIDIA cards).
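The padding itself is a simple round-up; a sketch of the arithmetic (the helper name is hypothetical):

```python
def pad_vocab_size(vocab_size, multiple=8):
    """Round vocab_size up to the next multiple, e.g. of 8 so that
    the output projection dimensions suit Tensor Cores."""
    if multiple <= 1:
        return vocab_size
    return ((vocab_size + multiple - 1) // multiple) * multiple
```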
flauted [Tue, 29 Jan 2019 13:59:06 +0000 (08:59 -0500)]
Add from_opt constructor to all encs/decs. (#1230)
* Add from_opt constructor to all encs/decs.
Paul Tardy [Tue, 29 Jan 2019 13:26:56 +0000 (14:26 +0100)]
Release running_lock on exceptions (#1233)
* Release running_lock on exceptions
Guillaume Klein [Tue, 29 Jan 2019 13:08:10 +0000 (14:08 +0100)]
Fix batch type "tokens" (#1234)
Vincent Nguyen [Tue, 29 Jan 2019 10:37:41 +0000 (11:37 +0100)]
Update .travis.yml
Vincent Nguyen [Tue, 29 Jan 2019 09:46:33 +0000 (10:46 +0100)]
Update .travis.yml
Ben Peters [Tue, 29 Jan 2019 08:02:32 +0000 (08:02 +0000)]
Clean decoder code, remove unnecessary attributes (#1219)
* remove unused attributes
* remove self._copy
* remove unnecessary attributes from cnn decoder
* remove unnecessary attributes from transformer decoder
* remove another attribute, update comments
* remove unnecessary attribute from TransformerEncoder
Guillaume Klein [Sun, 27 Jan 2019 16:16:41 +0000 (17:16 +0100)]
Flag to report some translation speed metrics (#1222)
flauted [Sat, 26 Jan 2019 06:40:41 +0000 (01:40 -0500)]
Add envs to make pull request script better for repeat runs. (#1220)
flauted [Fri, 25 Jan 2019 18:53:32 +0000 (13:53 -0500)]
Multilevel text field (#1216)
* Move batching and field logic from inputter to dsets.
* Remove _feature_tokenize from inputter.
* Remove src_lengths from audio.
* Make lengths a torch.int instead of self.dtype in AudioSeqField.
* Don't output src_lengths in AudioDataset.make_examples since they're no longer necessary
* Add temp fix for checking if data is text in make_features.
* First pass at a multi-level field design.
* Remove some unused code.
* Remove batch.src_is_text attr.
* Clean up by adding __iter__ to MultiField.
* Fix extract embeddings.
* Clean build_vocab (including bad indentation level), incorporate #1199, rename TextMultiField attrs.
* Remove make_features.
* Update semantics in direct calls to batch.src, batch.tgt.
* Test for old-style text fields while checking for old-style vocab.
Vincent Nguyen [Thu, 24 Jan 2019 21:22:23 +0000 (22:22 +0100)]
bump version 0.7.1 (#1217)
Ben Peters [Thu, 24 Jan 2019 16:08:14 +0000 (16:08 +0000)]
fix behavior with shard_size == 0 (#1215)
Ben Peters [Thu, 24 Jan 2019 16:07:30 +0000 (16:07 +0000)]
re-add source word logging (#1213)
Zhenxin Fu [Thu, 24 Jan 2019 10:37:35 +0000 (18:37 +0800)]
Fix bug: error when training from checkpoint with fix_word_vecs (#1211)
* fix bug when reloading optimizer with fix_word_vecs
* LINT
flauted [Wed, 23 Jan 2019 20:23:17 +0000 (15:23 -0500)]
Move batching and field logic from inputter to dsets (#1210)
* Move batching and field logic from inputter to dsets.
* Remove _feature_tokenize from inputter.
* Remove src_lengths from audio.
* Make lengths a torch.int instead of self.dtype in AudioSeqField.
* Don't output src_lengths in AudioDataset.make_examples since they're no longer necessary
* Add temp fix for checking if data is text in make_features.
* Fix for #6
* Solution for img model backwards compatibility.
Guillaume Klein [Wed, 23 Jan 2019 16:30:17 +0000 (17:30 +0100)]
Unify loss interface and do not force backpropagation (#1204)
Shujian2015 [Wed, 23 Jan 2019 16:29:22 +0000 (11:29 -0500)]
Update Summarization.md (#1205)
Elijah Rippeth [Wed, 23 Jan 2019 16:26:24 +0000 (11:26 -0500)]
make imports conditional on CUDA toolkit existence. (#1206)
Vincent Nguyen [Wed, 23 Jan 2019 11:41:50 +0000 (12:41 +0100)]
Revert " Move batching and field logic from inputter to dsets (#1196)" (#1203)
This reverts commit bd465cc85a5ff9b49bfc96a102708d71e772696f.
flauted [Wed, 23 Jan 2019 09:02:16 +0000 (04:02 -0500)]
Move batching and field logic from inputter to dsets (#1196)
* Move batching and field logic from inputter to dsets.
* Remove _feature_tokenize from inputter.
* Remove src_lengths from audio.
* Make lengths a torch.int instead of self.dtype in AudioSeqField.
* Don't output src_lengths in AudioDataset.make_examples since they're no longer necessary
* Add temp fix for checking if data is text in make_features.
Shujian2015 [Wed, 23 Jan 2019 08:35:04 +0000 (03:35 -0500)]
Update translator.py (#1202)
Vincent Nguyen [Tue, 22 Jan 2019 17:17:05 +0000 (18:17 +0100)]
Enable big file translation (for massive back translation) (#1192)
* Enable big file translation (for massive back translation) without running out of memory.
default shard_size=10000, reclass split_corpus in utils/misc.py
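A hedged sketch of what a `split_corpus` helper along these lines looks like: yield the corpus in shards of at most `shard_size` lines so the whole file is never held in memory (details may differ from the actual `utils/misc.py` code).

```python
from itertools import islice

def split_corpus(path, shard_size):
    """Yield successive shards of at most shard_size lines."""
    with open(path, "rb") as f:
        if shard_size <= 0:
            # No sharding requested: yield everything at once.
            yield f.readlines()
        else:
            while True:
                shard = list(islice(f, shard_size))
                if not shard:
                    break
                yield shard
```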
flauted [Tue, 22 Jan 2019 14:51:59 +0000 (09:51 -0500)]
Move copy scores from text dataset to copy_generator. (#1197)
flauted [Tue, 22 Jan 2019 14:34:41 +0000 (09:34 -0500)]
Smarter optional dependency imports in the datasets. (#1195)
flauted [Mon, 21 Jan 2019 19:47:05 +0000 (14:47 -0500)]
Fix serialization bug (#1188)
* In dataset base, remove __reduce_ex__ and override __getattr__.
``torchtext.Dataset.__getattr__`` is a generator. That doesn't
play well with pickle. Returning a generator (when appropriate)
seems to fix the issue without changing API.
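A simplified sketch of the pattern (class and attribute names are illustrative, not the repo's exact code): return a generator only for genuine field names and raise `AttributeError` otherwise, so pickle's probing for dunder methods like `__getstate__` behaves normally.

```python
class DatasetBase:
    def __init__(self, examples, fields):
        self.examples = examples
        self.fields = fields

    def __getattr__(self, attr):
        # Only called when normal lookup fails; go through __dict__
        # to avoid infinite recursion during unpickling.
        if attr in self.__dict__.get("fields", {}):
            return (getattr(ex, attr) for ex in self.examples)
        raise AttributeError(attr)
```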
Ben Peters [Mon, 21 Jan 2019 16:48:02 +0000 (16:48 +0000)]
Dataset data type attr (#1184)
* remove reference to dataset's data_type attribute from translator
* remove data.data_type from translation.py
* remove data_type class attribute from datasets
* remove data_type attribute from trainer
* remove dataset loading to get data_type
* refactor make_features
* fix bug in _score_target
* simplify copy attention checks
Frederic Bergeron [Mon, 21 Jan 2019 07:38:51 +0000 (16:38 +0900)]
Fixed issue #1186. (#1191)
Guillaume Klein [Sun, 20 Jan 2019 16:14:49 +0000 (17:14 +0100)]
Do not reset LR scheduling step with -reset_optim keep_states (#1190)
Ben Peters [Fri, 18 Jan 2019 19:05:53 +0000 (19:05 +0000)]
update -tensorboard_log_dir argument (#1183)
Ben Peters [Wed, 16 Jan 2019 18:48:08 +0000 (18:48 +0000)]
Fix 1168 and 1170 (#1182)
* allow old-style vocab loading to build copy attn fields
* fix fields when loading old trained model at translation time
Guillaume Klein [Wed, 16 Jan 2019 14:39:07 +0000 (15:39 +0100)]
Refactor Optimizer class (#1180)
* Refactor Optimizer class
The main change is to serialize the optimizer state and not the
instance itself. Saving the instance restricted changes applied to the
class and the use of optimizer wrappers. This change is backward
compatible with existing checkpoints.
The learning rate management is also updated: the learning rate value
is always computed dynamically based on the initial learning
rate and the current training step. The learning rate decay schedules
are expected to be passed as callable taking a step and returning a
scaling factor.
* Fix step loaded from old checkpoints
* Use functools.partial instead of lambdas
* Use @classmethod instead of @staticmethod
* Some renames
* Move LR scheduling out of Adafactor optimizer
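To illustrate the "callable taking a step and returning a scaling factor" convention, here is a hedged sketch using the Noam schedule and `functools.partial` (constants chosen for illustration; the repo's own function signature may differ):

```python
import functools

def noam_decay(step, warmup_steps, model_size):
    """Scaling factor applied to the initial learning rate at each step:
    linear warmup, then inverse-square-root decay."""
    return model_size ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)

# Passed to the optimizer as a callable of the step only,
# using functools.partial instead of a lambda:
schedule = functools.partial(noam_decay, warmup_steps=4000, model_size=512)
```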
Ben Peters [Tue, 15 Jan 2019 05:36:56 +0000 (05:36 +0000)]
Fix #1172 (#1178)
* update for new style vocab
* remove unnecessary future import
* fix file reading
Daphne Ippolito [Mon, 14 Jan 2019 17:10:38 +0000 (12:10 -0500)]
Improvements to random sampling at decode time. (#1174)
Ratish Puduppully [Mon, 14 Jan 2019 07:42:14 +0000 (07:42 +0000)]
fix for Not restoring state correctly during truncated bptt #1164 (#1171)
* fix for 'Not restoring state correctly during truncated bptt' #1164
Ben Peters [Wed, 9 Jan 2019 08:57:59 +0000 (08:57 +0000)]
Save fields instead of vocab (#1132)
* save fields instead of vocab
* update reading fields from checkpoint
* move logging from utility function
* add function to check for previous vocab format
* update vocab loading
* fix flake8
* do not pop src field from non-text data
* remove hard-coded special tokens, temporarily remove feature support
* change structure of fields created at preprocess time
* pass truncation to fields creation function correctly
* update DatasetBase constructor
* build vocab with new structure of fields
* update expectations of fields at train time, remove collect_features
* update structure of fields at translate time
* update loading from old preprocessed data
* update test for new fields
* move src_field definition location
* do not pop img or audio from build_vocab
* update fields in extract_embeddings
* fix merge
* fix old-style fields code
Daphne Ippolito [Mon, 7 Jan 2019 14:31:42 +0000 (09:31 -0500)]
Add random sampling as alternative to beam search. (#1157)
group.add('--random_sampling_topk', '-random_sampling_topk',
default=1, type=int,
help="""Set this to -1 to do random sampling from full
distribution. Set this to value k>1 to do random
sampling restricted to the k most likely next tokens.
Set this to 1 to use argmax or for doing beam
search.""")
Vincent Nguyen [Wed, 2 Jan 2019 17:47:54 +0000 (18:47 +0100)]
fix torch.no_grad for validation (#1156)
bump version
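The shape of the fix, in a hedged sketch (the actual trainer code differs): wrap the validation loop in `torch.no_grad()` so no autograd graph is built.

```python
import torch

def validate(model, valid_iter, loss_fn):
    """Run validation without building the autograd graph."""
    model.eval()
    total = 0.0
    with torch.no_grad():
        for x, y in valid_iter:
            total += loss_fn(model(x), y).item()
    model.train()
    return total
```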
François Hernandez [Mon, 31 Dec 2018 17:24:38 +0000 (18:24 +0100)]
Revert "Simplify multi headed attention (#1153)" (#1155)
This reverts commit 3d9a4181cc36e1c0817e103905540a3d05726646.
coder [Mon, 31 Dec 2018 16:35:50 +0000 (08:35 -0800)]
Simplify multi headed attention (#1153)
* Simplify multi headed attention
Vincent Nguyen [Mon, 31 Dec 2018 07:05:58 +0000 (08:05 +0100)]
Merge pull request #1151 from bpopeters/fix-1149
Fix #1149
bpopeters [Mon, 31 Dec 2018 05:42:48 +0000 (21:42 -0800)]
move imports to extract_features
Ben Peters [Fri, 28 Dec 2018 19:26:51 +0000 (11:26 -0800)]
Do not read entire corpus at once when preprocessing or save temp files (#1136)
* do not write temporary shard files
* add shard size assert
* unify expectations of input files and iterators across datasets
* fix argument signature
Ben Peters [Fri, 28 Dec 2018 18:27:10 +0000 (10:27 -0800)]
Simplify dataset loading, remove incorrect documentation (#1133)
* simplify lazily_load_dataset, remove incorrect documentation
* remove one lazily_load_dataset call
* dataset loading to build_dataset_iter
* move nested function, remove incorrect documentation
* begin refactoring DatasetLazyIter
* fix flake8
* refactor DatasetLazyIter
* update doc
* remove train_iter_fct, valid_iter_fct
Vincent Nguyen [Fri, 28 Dec 2018 18:12:18 +0000 (19:12 +0100)]
Merge pull request #1142 from magic282/patch-1
Update optimizers.py
François Hernandez [Fri, 28 Dec 2018 17:11:43 +0000 (18:11 +0100)]
Fix comments in AAN module (#1143)
* Fix comments in AAN module
Ben Peters [Fri, 28 Dec 2018 15:51:36 +0000 (07:51 -0800)]
use torchaudio.legacy.load (#1144)
* use torchaudio.legacy.load
Vincent Nguyen [Fri, 28 Dec 2018 09:24:24 +0000 (10:24 +0100)]
extra space removal
Vincent Nguyen [Thu, 27 Dec 2018 21:31:57 +0000 (22:31 +0100)]
revert travis test
Vincent Nguyen [Thu, 27 Dec 2018 21:18:41 +0000 (22:18 +0100)]
test travis
magic282 [Thu, 27 Dec 2018 12:17:21 +0000 (20:17 +0800)]
Update optimizers.py
Fix `AttributeError: 'Optimizer' object has no attribute 'params'`
Sumith Kulal [Wed, 19 Dec 2018 19:32:54 +0000 (11:32 -0800)]
Exit early if src/tgt files do not exist (#1123)
Ben Peters [Wed, 19 Dec 2018 17:05:47 +0000 (09:05 -0800)]
Do not filter validation data by default (#1118)
* do not filter validation data
* minimum length in filter is now actually minimum length, sensible defaults for maximum sequence length in build_dataset
* add option for optionally filtering validation data
Vincent Nguyen [Wed, 19 Dec 2018 13:50:35 +0000 (14:50 +0100)]
Introducing Adafactor (#1124)
* introducing Adafactor
Ben Peters [Tue, 18 Dec 2018 21:20:07 +0000 (13:20 -0800)]
Unify dataset constructors, simplify dataset building (#1115)
* remove make_text_examples
* remove make_image_examples
* remove make_audio_examples
* remove building dataset from iterator
* remove TextDataset.make_iterator_from_file, combine with make_examples
* combine image's make_iterator_from_file and make_examples
* rename AudioDataset method
* unify dataset constructors
* move _dynamic_dict to DatasetBase
* fix import bug
* fix import error (for real this time)
* change dynamic_dict default value
* dataset creation with iterators (except audio)
* remove hard-coded specials from _dynamic_dict
* do not split in _read_file
* update src argument
* fix silly error
* remove print
Guillaume Klein [Tue, 18 Dec 2018 17:35:29 +0000 (18:35 +0100)]
Small optimizer refactoring (#1119)
* Make MultipleOptimizer expose a param_group attribute
This allows using the same learning rate update for all optimizers.
* Harmonize learning rate decay definition
* Make each method compute a scaling factor for the given step
* Implement the general learning rate decay as a single formula
* Set initial_accumulator_value argument for Adagrad
* Remove params attribute from the Optimizer class
The parameters are already stored in the underlying optimizer.
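A minimal sketch of a wrapper exposing a single `param_groups` view over several optimizers (illustrative only; the repo's `MultipleOptimizer` has more methods):

```python
class MultipleOptimizer:
    """Wrap several optimizers but expose one param_groups list,
    so a single learning-rate update covers them all."""

    def __init__(self, optimizers):
        self.optimizers = optimizers

    @property
    def param_groups(self):
        return [g for opt in self.optimizers for g in opt.param_groups]

    def step(self):
        for opt in self.optimizers:
            opt.step()

    def zero_grad(self):
        for opt in self.optimizers:
            opt.zero_grad()
```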
Vincent Nguyen [Mon, 17 Dec 2018 14:54:31 +0000 (15:54 +0100)]
Merge pull request #1117 from bpopeters/fix-1116
Fix #1116
bpopeters [Mon, 17 Dec 2018 14:17:07 +0000 (06:17 -0800)]
use specials from tgt, not src
Ben Peters [Sat, 15 Dec 2018 18:46:30 +0000 (18:46 +0000)]
Reduce use of hard-coded values for special symbols, remove unused classes (#1114)
* remove references to constant special symbols
* refactor CopyGenerator to not require tgt_dict, inputters.PAD_WORD
* fix incorrect documentation
* remove hard-coded specials from model_builder
* remove unused methods and attributes
* fix spelling error
* remove hard-coded special tokens from loss computation
* remove hard-coded special tokens
* do not hardcode unk with 0
* use assert instead of raise AssertionError
* remove meaningless line in save_fields_to_vocab, use list comprehension
* refactor merge_vocabs to make it private and not depend on hard-coded special tokens
* simplify load_vocabulary
* fix load_vocabulary bug
* remove UNK, which is unused
* remove get_num_features
* remove unused ShardedCorpusIterator and similar, update comments
* fix variable name, update test_models.py
* remove continuation-line ternary
* fix error message spacing
* fix rnn dim check
* make long args lists vertical
Vincent Nguyen [Wed, 12 Dec 2018 09:53:20 +0000 (10:53 +0100)]
Merge pull request #1110 from elisemicho/master
Update Summarization.md
Vincent Nguyen [Tue, 11 Dec 2018 17:48:51 +0000 (18:48 +0100)]
Merge pull request #1108 from pltrdy/critical_sections
Introduce critical section decorator for translation server
elise [Tue, 11 Dec 2018 16:44:59 +0000 (17:44 +0100)]
Update Summarization.md
Vincent Nguyen [Tue, 11 Dec 2018 16:28:26 +0000 (17:28 +0100)]
Merge pull request #1109 from bpopeters/remove-leftover-comment
remove commented-out method
bpopeters [Tue, 11 Dec 2018 14:44:43 +0000 (14:44 +0000)]
remove commented-out method
Ben Peters [Tue, 11 Dec 2018 13:17:59 +0000 (13:17 +0000)]
Remove unnecessary feature-related code, share more between datasets (#1103)
Paul Tardy [Tue, 11 Dec 2018 13:16:34 +0000 (14:16 +0100)]
Remove config_opts from server (#1107)
* Remove config_opts from server
* Remove unused import
pltrdy [Tue, 11 Dec 2018 11:44:46 +0000 (12:44 +0100)]
Introduce critical section decorator for translation server
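The core of such a decorator can be sketched as below (a hypothetical minimal version; the server's actual decorator may add timeouts or error handling):

```python
import functools
import threading

def critical(func):
    """Serialize concurrent calls to func behind a single lock."""
    lock = threading.Lock()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with lock:
            return func(*args, **kwargs)
    return wrapper
```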
Vincent Nguyen [Tue, 11 Dec 2018 09:27:33 +0000 (10:27 +0100)]
Merge pull request #1106 from vince62s/fix-decayopt
fix doc opts
moses [Tue, 11 Dec 2018 08:55:31 +0000 (09:55 +0100)]
fix doc opts
Vincent Nguyen [Tue, 11 Dec 2018 08:22:42 +0000 (09:22 +0100)]
fix tests for pytorch 1.0 (#1105)
Switch to pytorch 1.0
Ben Peters [Mon, 10 Dec 2018 07:59:53 +0000 (07:59 +0000)]
Clean dataset code, make it more in line with torchtext examples (#1102)
* update style and comments
* make sort_key a staticmethod
* make filter_pred function a parameter of the dataset
* update kwargs name in test
* remove obsolete dataset arguments
Ben Peters [Sun, 9 Dec 2018 19:25:37 +0000 (19:25 +0000)]
Simplify dataset building, update some comments (#1098)
Vincent Nguyen [Sun, 9 Dec 2018 18:46:15 +0000 (19:46 +0100)]
Merge pull request #1101 from francoishernandez/cleaner_cache
remove useless code in transformer decoder cache
François Hernandez [Sun, 9 Dec 2018 18:30:55 +0000 (19:30 +0100)]
remove useless code in transformer decoder cache
Vincent Nguyen [Sun, 9 Dec 2018 08:25:27 +0000 (09:25 +0100)]
Merge pull request #1096 from bpopeters/flexible-tokenize
Simplify get_fields
Vincent Nguyen [Sat, 8 Dec 2018 09:25:28 +0000 (10:25 +0100)]
Merge pull request #1097 from guillaumekln/remove-update-state
Remove all Decoder.update_state methods
Guillaume Klein [Sat, 8 Dec 2018 08:30:55 +0000 (09:30 +0100)]
Remove all Decoder.update_state methods
This method provides little benefit over directly updating the state
structure.
bpopeters [Fri, 7 Dec 2018 23:47:11 +0000 (23:47 +0000)]
simplify get_fields, remove redundant code
Vincent Nguyen [Fri, 7 Dec 2018 17:51:33 +0000 (18:51 +0100)]
Merge pull request #1094 from bpopeters/fix-1090
Fix #1090, dimension issues in copy generator loss
bpopeters [Fri, 7 Dec 2018 17:35:58 +0000 (17:35 +0000)]
specify squeeze dimension
Vincent Nguyen [Tue, 4 Dec 2018 19:03:55 +0000 (20:03 +0100)]
Merge pull request #1092 from elisemicho/master
reduce valid batch size in summarization tutorial
elise [Tue, 4 Dec 2018 16:46:22 +0000 (17:46 +0100)]
reduce valid batch size in summarization tutorial
Vincent Nguyen [Sun, 2 Dec 2018 17:02:58 +0000 (18:02 +0100)]
correction #1080 (#1088)
Vincent Nguyen [Sun, 2 Dec 2018 12:59:14 +0000 (13:59 +0100)]
Merge pull request #1086 from vince62s/fix-1080
fix #1080
moses [Sun, 2 Dec 2018 11:11:46 +0000 (12:11 +0100)]
fix #1080
Vincent Nguyen [Sun, 2 Dec 2018 06:40:35 +0000 (07:40 +0100)]
Merge pull request #1084 from guillaumekln/cleanup-transformer-state
Only cache projections in Transformer state
Guillaume Klein [Fri, 30 Nov 2018 17:09:52 +0000 (18:09 +0100)]
Only cache projections in Transformer state
Both translation paths now set with_cache=True, so let's clean up the
Transformer state to reflect that.
Vincent Nguyen [Fri, 30 Nov 2018 16:05:56 +0000 (17:05 +0100)]
Merge pull request #1081 from guillaumekln/auto-broadcast
Remove some manual tensor expansions
Guillaume Klein [Fri, 30 Nov 2018 15:45:20 +0000 (16:45 +0100)]
Add shape as comment for mask tensors
Vincent Nguyen [Fri, 30 Nov 2018 07:42:10 +0000 (08:42 +0100)]
Update README.md
Guillaume Klein [Thu, 29 Nov 2018 15:51:47 +0000 (16:51 +0100)]
Remove some manual tensor expansions
and rely on the operators' broadcasting semantics.
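The idea, in a hedged toy example (shapes chosen for illustration): a mask shaped `[batch, 1, seq_len]` broadcasts against attention scores shaped `[batch, heads, seq_len]` without an explicit `expand`.

```python
import torch

mask = torch.zeros(2, 1, 5, dtype=torch.bool)  # [batch, 1, seq_len]
mask[:, :, 0] = True
scores = torch.randn(2, 4, 5)                  # [batch, heads, seq_len]

# Manual expansion of the mask...
expanded = scores.masked_fill(mask.expand(2, 4, 5), float("-inf"))
# ...versus relying on broadcasting:
broadcast = scores.masked_fill(mask, float("-inf"))
```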
Vincent Nguyen [Wed, 28 Nov 2018 10:53:40 +0000 (11:53 +0100)]
Merge pull request #1077 from vince62s/v0.6
Bump version 0.6
moses [Wed, 28 Nov 2018 10:17:00 +0000 (11:17 +0100)]
Bump version
Sho Hoshino [Mon, 26 Nov 2018 14:57:20 +0000 (23:57 +0900)]
[WIP] Fix attention option when using fast beam search (#1073)
Fix attention option when using fast beam search
Vincent Nguyen [Sun, 25 Nov 2018 13:13:59 +0000 (14:13 +0100)]
Merge pull request #1071 from Belerafon/master
Doc update for the ability to reset the optimizer
Ben Peters [Sun, 25 Nov 2018 12:41:32 +0000 (12:41 +0000)]
Make loss code more uniform across model types (#1057)
* update SparsemaxLoss signature to match NLLLoss
* have NMTLossCompute take criterion as argument, remove unnecessary attributes from LossComputeBase
* update CopyGeneratorLossCompute and CopyGeneratorCriterion to be compatible with changes to LossComputeBase
* replace CopyGeneratorCriterion with CopyGeneratorLoss (unfinished effort to unify the copy code with the other losses)
* add comments, use inputters.UNK instead of magic number
* rewrite copy loss forward
* make unk index a parameter of CopyGeneratorLoss analogous to the ignore index, update comments
* add paper ref
* fix deprecated F.sigmoid