wolfpit.net Git - experiments/OpenNMT-py/.git/log
flauted [Thu, 27 Jun 2019 11:06:22 +0000 (07:06 -0400)]
Video captioning (#1409)
* Add feature extraction tool.
* Update preprocess.
* Add training and translation.
* Adapt transformer for video.
* Add tutorial to docs.
* Add folded val files for easier 'early stop.'
* Add and document transformer.
Paul Tardy [Thu, 27 Jun 2019 10:45:10 +0000 (12:45 +0200)]
Implementing coverage loss of abisee (2017) (#1464)
* Implementing coverage loss of abisee (2017)
* fix lambda_coverage value
Vincent Nguyen [Thu, 27 Jun 2019 09:05:45 +0000 (11:05 +0200)]
Optimize AAN transformer and small fixes (#1482)
* Optimize AAN transformer and small fixes
* Make use of FFN layer in AAN an option
Soumya Sanyal [Wed, 26 Jun 2019 21:38:55 +0000 (03:08 +0530)]
Minor change in MultiHeadedAttention documentation (#1479)
* Minor change in documentation
Vincent Nguyen [Fri, 21 Jun 2019 13:39:06 +0000 (15:39 +0200)]
fix typo
max [Wed, 19 Jun 2019 15:58:23 +0000 (10:58 -0500)]
add health endpoint to server.py (#1471)
Uri Alon [Tue, 18 Jun 2019 19:54:05 +0000 (22:54 +0300)]
Fix a potential IndexError when translating with replace_unk (#1469)
* Fix an IndexError that occurs with replace_unk when the attention argmax falls on padding instead of a real source token
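A minimal sketch of the guard this fix describes (names are illustrative, not the project's actual function): when replacing an `<unk>` token, only trust the attention argmax if it points at a real source token, falling back to the last real token when it lands on padding.

```python
# Hypothetical sketch: guard the attention argmax against padding
# positions when doing replace_unk-style token substitution.
def replace_unk_token(pred_token, attn_row, src_tokens, unk="<unk>"):
    """Replace an <unk> prediction with the attended-to source token."""
    if pred_token != unk:
        return pred_token
    # argmax over attention weights; may land on a padding position
    max_index = max(range(len(attn_row)), key=lambda i: attn_row[i])
    if max_index >= len(src_tokens):
        # attention pointed at padding: fall back to the last real token
        max_index = len(src_tokens) - 1
    return src_tokens[max_index]
```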
max [Tue, 18 Jun 2019 19:53:29 +0000 (14:53 -0500)]
fix incorrect script path in CONTRIBUTING.md (#1470) (#1472)
Guillaume Klein [Thu, 13 Jun 2019 15:23:08 +0000 (17:23 +0200)]
Do not raise an error for missing validation data (#1467)
François Hernandez [Thu, 13 Jun 2019 09:17:52 +0000 (11:17 +0200)]
bump 0.9.1 (#1466)
François Hernandez [Thu, 13 Jun 2019 08:04:54 +0000 (10:04 +0200)]
New apex amp API (#1465)
* use new apex amp API
* make apex opt_level as option
Yasmin Moslem [Thu, 13 Jun 2019 07:44:56 +0000 (09:44 +0200)]
Change map_location to be 'cpu' (#1461)
* Change map_location to be 'cpu'
Without this, loading fails on a CPU-only machine. Model averaging should not require a GPU; moreover, it may be faster to average on CPU than to move all models to the GPU first.
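An illustrative sketch of the averaging step, not the project's exact code: in PyTorch the checkpoints would be loaded with `torch.load(path, map_location='cpu')` so no GPU is needed; here the state dicts are plain dicts of floats for clarity.

```python
# Hypothetical sketch: average parameter values across several
# checkpoint state dicts (all already loaded onto CPU).
def average_checkpoints(state_dicts):
    """Return a state dict whose values are the mean across checkpoints."""
    n = len(state_dicts)
    avg = {}
    for key in state_dicts[0]:
        avg[key] = sum(sd[key] for sd in state_dicts) / n
    return avg
```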
Vincent Nguyen [Fri, 7 Jun 2019 08:25:28 +0000 (10:25 +0200)]
Update requirements.opt.txt
Vincent Nguyen [Fri, 7 Jun 2019 08:09:52 +0000 (10:09 +0200)]
Update requirements.opt.txt
Khoa Ho [Wed, 5 Jun 2019 10:07:56 +0000 (03:07 -0700)]
Clarify mixed precision training support (#1458)
Change the wording to avoid confusion. Mixed precision ensures both higher arithmetic throughput and numerical stability; it is not synonymous with pure half-precision/FP16 training. Also mention tensor cores, since older-generation GPUs without tensor cores don't support true mixed precision training.
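A framework-free sketch of dynamic loss scaling, the core trick that gives mixed precision its numerical stability (this is a simplified model of what libraries like Apex do, not OpenNMT-py's code): scale the loss up so small gradients survive FP16, unscale before the optimizer step, and back off when an overflow is detected.

```python
import math

# Hypothetical sketch of a dynamic loss scaler; names and defaults are
# illustrative, not any library's actual API.
class DynamicLossScaler:
    def __init__(self, scale=2.0 ** 16, backoff=0.5, growth_interval=2000):
        self.scale = scale
        self.backoff = backoff
        self.growth_interval = growth_interval
        self._good_steps = 0

    def unscale(self, grads):
        """Unscale gradients; return None (skip step) if any overflowed."""
        if any(math.isinf(g) or math.isnan(g) for g in grads):
            self.scale *= self.backoff      # overflow: shrink the scale
            self._good_steps = 0
            return None
        self._good_steps += 1
        if self._good_steps % self.growth_interval == 0:
            self.scale *= 2.0               # stable for a while: grow back
        return [g / self.scale for g in grads]
```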
François Hernandez [Tue, 4 Jun 2019 07:56:52 +0000 (09:56 +0200)]
[fix] Make queue.put() blocking again (#1455)
Fix #1454 .
Paul Tardy [Mon, 3 Jun 2019 18:08:59 +0000 (20:08 +0200)]
Using Producer-Consumer for batches (#1450)
* Working queues on multi-GPU on text and audio
* Working quite well, even with dynamic_dict
* Remove explicit garbage collection that made some queues hang, plus other fixes
* fix process not ending
* properly set random seed and fill queues sequentially
* make queues work with distributed training
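A rough sketch of the producer-consumer pattern this PR introduces (threads stand in for the real code's per-GPU processes): one producer fills a bounded queue with batches while a consumer drains it, and a blocking `put()` provides natural backpressure when consumers fall behind.

```python
import queue
import threading

# Illustrative only: batches are plain ints; a None sentinel marks the end.
def producer(q, batches):
    for batch in batches:
        q.put(batch, block=True)   # blocks when the queue is full
    q.put(None)                    # sentinel: no more batches

def consume_all(q):
    seen = []
    while True:
        batch = q.get()
        if batch is None:
            break
        seen.append(batch)
    return seen

q = queue.Queue(maxsize=2)
t = threading.Thread(target=producer, args=(q, [1, 2, 3, 4]))
t.start()
result = consume_all(q)
t.join()
```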
François Hernandez [Mon, 3 Jun 2019 12:37:17 +0000 (14:37 +0200)]
trim vocab(s) before saving checkpoint (#1453)
François Hernandez [Fri, 24 May 2019 11:45:31 +0000 (13:45 +0200)]
do not overwrite pt vocab when preprocessing again (#1447)
François Hernandez [Thu, 23 May 2019 15:49:47 +0000 (17:49 +0200)]
small fix multi when common root in data_ids (#1444)
Ignacio Brasca [Sun, 19 May 2019 17:32:11 +0000 (14:32 -0300)]
Update README.md (#1437)
François Hernandez [Fri, 17 May 2019 08:06:41 +0000 (10:06 +0200)]
make MultipleDatasetIterator only if necessary (#1436)
Vincent Nguyen [Thu, 16 May 2019 15:15:50 +0000 (17:15 +0200)]
bump version (#1434)
François Hernandez [Thu, 16 May 2019 14:27:11 +0000 (16:27 +0200)]
Preprocessing: faster build vocab + multiple weighted datasets (#1413)
* handle multiple training corpora and enable weighting
* move fields vocab building logic in function
* fix device handling MultipleDatasetIterator
* fix multi/yield_raw_batch parameter DatasetLazyIter
* update FAQ.md
* add -pool_factor option
* reduce pool_factor for travis runs
François Hernandez [Thu, 16 May 2019 14:25:27 +0000 (16:25 +0200)]
Introduce dropout scheduler (#1421)
* add update_dropout methods approx. everywhere, dropout scheduler
* more meaningful log
* forgot some layers in audio_encoder
Vincent Nguyen [Tue, 7 May 2019 10:15:38 +0000 (12:15 +0200)]
Upgrade Travis to Torch 1.1 (#1426)
Vincent Nguyen [Mon, 6 May 2019 18:32:39 +0000 (20:32 +0200)]
fix goldscore #1383 (#1423)
* fix #1383
* fix gold score only
hukuda222 [Tue, 30 Apr 2019 17:45:55 +0000 (02:45 +0900)]
fix typo (#1416)
François Hernandez [Mon, 29 Apr 2019 18:38:12 +0000 (20:38 +0200)]
add src or tgt min frequency to counter value (#1414)
Jong Hun Shin [Sat, 27 Apr 2019 19:38:11 +0000 (04:38 +0900)]
Check -gpu_ranks option to ensure saving a model (#1407)
* Check -gpu_ranks option to ensure saving a model
* split condition to check -gpu_ranks inconsistency
elisemicho [Wed, 17 Apr 2019 16:04:46 +0000 (18:04 +0200)]
fix Runtime error in Library tutorial (#1399)
Clément [Mon, 15 Apr 2019 14:43:59 +0000 (16:43 +0200)]
Avoid padding indices in MeanEncoder (#1398)
* We avoid padding while mean pooling
* placed batch dimension first for bmm
* replaced accidentally deleted line
Andrew Mulder [Fri, 12 Apr 2019 18:24:51 +0000 (14:24 -0400)]
check vocab files exist at start of preprocessing (#1396)
António Lopes [Tue, 9 Apr 2019 19:55:05 +0000 (20:55 +0100)]
Added earlystopping mechanism (#1389)
* Added earlystopping mechanism
* Fixed earlystopping multi-gpu stoppage
Vincent Nguyen [Mon, 8 Apr 2019 07:53:33 +0000 (09:53 +0200)]
Better handle Cuda OOM with overflow batches (#1385)
Andrew Mulder [Fri, 5 Apr 2019 19:46:13 +0000 (21:46 +0200)]
write attention debug to log file (#1384)
DNGros [Wed, 3 Apr 2019 20:53:00 +0000 (15:53 -0500)]
More efficient embeddings_to_torch.py (#1372)
* Update embeddings_to_torch.py to be more memory efficient by only loading vectors which are present in the vocab into memory.
* remove dead code and flake8 violations introduced with
57cefb7ee
* update docs of using Glove embeddings. Fix spelling error
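A hypothetical sketch of the memory saving described above: stream a GloVe-style text file line by line and keep only vectors for words already in the vocabulary, instead of materializing every vector first.

```python
# Illustrative helper, not the script's actual code: `lines` is any
# iterable of "word v1 v2 ..." strings, `vocab` a set of kept words.
def load_vectors_for_vocab(lines, vocab):
    vectors = {}
    for line in lines:
        word, *values = line.rstrip().split(" ")
        if word in vocab:                      # skip out-of-vocab rows
            vectors[word] = [float(v) for v in values]
    return vectors
```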
flauted [Wed, 3 Apr 2019 20:52:25 +0000 (16:52 -0400)]
Have EnsembleDecoder set attentional property. (#1381)
Yasmin M [Thu, 28 Mar 2019 17:42:59 +0000 (19:42 +0200)]
Add phrase_table translation argument (#1370)
* Add phrase_table translation argument
If phrase_table is provided (with replace_unk), it will look up the identified source token and give the corresponding target token. If it is not provided (or the identified source token does not exist in the table), then it will copy the source token.
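The lookup described above can be sketched in a few lines (names are illustrative): with replace_unk, an `<unk>` is replaced by the phrase-table translation of the attended-to source token, falling back to copying the source token when no table or no entry exists.

```python
# Hypothetical sketch of the phrase_table fallback logic.
def resolve_unk(src_token, phrase_table=None):
    if phrase_table and src_token in phrase_table:
        return phrase_table[src_token]   # table hit: use target token
    return src_token                     # no table or no entry: copy source
```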
Vincent Nguyen [Thu, 28 Mar 2019 07:25:15 +0000 (08:25 +0100)]
advanced noam with decay and accum scheduler (#1367)
* advanced noam with decay and accum scheduler
lordeddard [Mon, 11 Mar 2019 14:04:06 +0000 (19:34 +0530)]
Using -src_vocab to pass vocab.pt files. (#1350)
lordeddard [Fri, 8 Mar 2019 14:53:20 +0000 (20:23 +0530)]
Option to give an existing vocab file(.pt or text) to preprocess.py (#1346)
* Option to give an existing vocab file(.pt or text) to preprocess.py
Vincent Nguyen [Fri, 8 Mar 2019 13:00:25 +0000 (14:00 +0100)]
tgt/src ratio based beam stopping condition (#1344)
* tgt/src ratio based beam stopping condition
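An illustrative sketch of a ratio-based stopping condition of this kind (the ratio value is an assumption, not the PR's default): stop extending a hypothesis once the target grows past a fixed multiple of the source length.

```python
# Hypothetical sketch: stop decoding when tgt/src length ratio is exceeded.
def should_stop(tgt_len, src_len, ratio=1.5):
    return tgt_len > ratio * src_len
```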
flauted [Tue, 5 Mar 2019 15:32:29 +0000 (10:32 -0500)]
Map state on coverage if coverage is there. (#1338)
Paul Tardy [Tue, 5 Mar 2019 11:38:35 +0000 (12:38 +0100)]
Fix UTF-8 encoding error in translation server (python2 only) (#1340)
Vincent Dahmen [Wed, 27 Feb 2019 13:33:19 +0000 (14:33 +0100)]
Update requirements.txt (#1326)
* Update requirements.txt
Use pip version
* Update requirements.txt
Fahim Dalvi [Sun, 24 Feb 2019 17:05:19 +0000 (20:05 +0300)]
Fix translate.py bug (OpenNMT/OpenNMT-py#1317) (#1318)
`translate.py` would ignore all sentences after processing
`shard_size*shard_size` sentences in `src`, if no `tgt` file
was provided. This commit fixes this.
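A sketch of shard iteration that does not stop early (illustrative, not the script's code): keep pulling `shard_size` lines until the input is exhausted, rather than iterating a fixed number of shards, which is what effectively capped input at `shard_size*shard_size` sentences.

```python
from itertools import islice

# Hypothetical sketch: yield shards until the source iterator is empty.
def iter_shards(lines, shard_size):
    it = iter(lines)
    while True:
        shard = list(islice(it, shard_size))
        if not shard:
            return
        yield shard
```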
Kano Takatomo [Sun, 24 Feb 2019 14:25:59 +0000 (23:25 +0900)]
Bug fix of AudioSeqField (#1319)
* Bug fix
flauted [Tue, 19 Feb 2019 08:59:15 +0000 (03:59 -0500)]
Disable attention (#1314)
* Allow disabling attention.
* Test no/different copy attn, integrate copy into CNN dec.
* Update docs.
* Add back the 'empty' model test.
Vincent Nguyen [Sun, 17 Feb 2019 16:02:33 +0000 (17:02 +0100)]
bump v0.8.2 (#1310)
flauted [Sun, 17 Feb 2019 15:54:09 +0000 (10:54 -0500)]
Add flask dependency (#1312)
* Add flask dependency
* [Test] Try to make travis fail if docs don't build.
* Revert requirements (reinclude flask).
flauted [Sun, 17 Feb 2019 09:17:44 +0000 (04:17 -0500)]
Update docs (#1311)
* Update docs, standardize shape, add CONTRIBUTING note, use sphinx-argparse.
flauted [Sat, 16 Feb 2019 12:59:46 +0000 (07:59 -0500)]
Fix bidirectional not working. (#1306)
Guillaume Klein [Sat, 16 Feb 2019 10:24:58 +0000 (11:24 +0100)]
Fix doctr deploy (#1309)
Vincent Nguyen [Fri, 15 Feb 2019 20:23:02 +0000 (21:23 +0100)]
disable deploy doctr (#1305)
Vincent Nguyen [Fri, 15 Feb 2019 19:20:45 +0000 (20:20 +0100)]
Update onmt.modules.rst
Vincent Nguyen [Fri, 15 Feb 2019 19:17:51 +0000 (20:17 +0100)]
Update onmt.modules.rst
Vincent Nguyen [Fri, 15 Feb 2019 19:14:18 +0000 (20:14 +0100)]
Update onmt.modules.rst
Vincent Nguyen [Fri, 15 Feb 2019 19:10:15 +0000 (20:10 +0100)]
Update onmt.rst
Vincent Nguyen [Fri, 15 Feb 2019 17:55:16 +0000 (18:55 +0100)]
Update onmt.modules.rst
Vincent Nguyen [Fri, 15 Feb 2019 17:49:44 +0000 (18:49 +0100)]
Update index.rst
Vincent Nguyen [Fri, 15 Feb 2019 17:49:03 +0000 (18:49 +0100)]
Rename onmt.io.rst to onmt.inputters.rst
Vincent Nguyen [Fri, 15 Feb 2019 17:48:41 +0000 (18:48 +0100)]
Update onmt.io.rst
Vincent Nguyen [Fri, 15 Feb 2019 17:45:23 +0000 (18:45 +0100)]
Update onmt.rst
elisemicho [Fri, 15 Feb 2019 17:10:02 +0000 (18:10 +0100)]
fix onmt as library example (#1292)
Vincent Nguyen [Fri, 15 Feb 2019 15:05:17 +0000 (16:05 +0100)]
Allow forcing FP32 at inference of FP16 models (#1303)
flauted [Fri, 15 Feb 2019 14:43:46 +0000 (09:43 -0500)]
Fix ensembling again. (#1302)
flauted [Fri, 15 Feb 2019 14:13:33 +0000 (09:13 -0500)]
Simplify fields structure (#1299)
flauted [Fri, 15 Feb 2019 13:54:08 +0000 (08:54 -0500)]
Fix ensemble decoding. (#1301)
flauted [Fri, 15 Feb 2019 07:56:35 +0000 (02:56 -0500)]
Revamp args (#1287)
* Factor out validation and default opt getting
* Update test_models to use new parser.
* Remove unnecessary cast.
* Test translation server, format its docs.
* Fix Py27 compatibility.
* Actually fix Py27? And start testing TranslationServer.
flauted [Thu, 14 Feb 2019 09:11:05 +0000 (04:11 -0500)]
Add from_opt constructor to Translator. (#1296)
François Hernandez [Wed, 13 Feb 2019 17:29:29 +0000 (18:29 +0100)]
Compute and store moving average in fp32 (#1294)
* compute and store moving average in fp32
Vincent Nguyen [Tue, 12 Feb 2019 08:36:52 +0000 (09:36 +0100)]
Bump version 0.8.1 (#1286)
flauted [Tue, 12 Feb 2019 07:53:27 +0000 (02:53 -0500)]
Random sampling scores (#1285)
* In random sampling, make scores be score of EOS.
flauted [Mon, 11 Feb 2019 14:03:45 +0000 (09:03 -0500)]
Better buffering, readability in beam_search; random sampling class (#1281)
* Better buffering, readability, possible bug fix for return_attention
* Update docs.
* Test with return attention.
* Update docs, translation device selection, refactor translation strategies.
Charles Lovering [Mon, 11 Feb 2019 08:37:03 +0000 (03:37 -0500)]
Remove unused command line argument. (#1283)
There is an unused argument in an example that seems to have been refactored out.
Vincent Nguyen [Sun, 10 Feb 2019 08:57:30 +0000 (09:57 +0100)]
Update requirements.opt.txt
Vincent Nguyen [Sun, 10 Feb 2019 08:29:20 +0000 (09:29 +0100)]
Update requirements.opt.txt
Vincent Nguyen [Sat, 9 Feb 2019 21:05:07 +0000 (22:05 +0100)]
update docs (#1282)
Vincent Nguyen [Sat, 9 Feb 2019 17:47:55 +0000 (18:47 +0100)]
Bump version 0.8.0 (#1280)
François Hernandez [Sat, 9 Feb 2019 16:35:34 +0000 (17:35 +0100)]
fix save moving average (#1279)
flauted [Sat, 9 Feb 2019 16:17:22 +0000 (11:17 -0500)]
Implement coverage penalty in fast beam search (#1277)
* Test length penalty.
* Fix #994 - length penalty scores now appear consistent between the fast and slow beam search.
* Get matching summary cov score on fast.
* Get stepwise coverage penalty scores matching.
* Better document beam search.
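The penalties involved here can be sketched with the GNMT-style formulas (Wu et al., 2016); exact OpenNMT-py signatures differ, this only shows the math: beam scores are divided by a length penalty, and a coverage penalty computed from accumulated attention is applied.

```python
import math

# Illustrative formulas only, not the project's GlobalScorer API.
def length_penalty(length, alpha=0.0):
    # GNMT length penalty: ((5 + |y|) / 6) ** alpha
    return ((5.0 + length) / 6.0) ** alpha

def coverage_penalty(coverage, beta=0.0):
    # coverage[i]: total attention source position i has received;
    # returns a non-negative penalty to subtract from the score.
    return -beta * sum(math.log(min(c, 1.0)) for c in coverage)
```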
flauted [Sat, 9 Feb 2019 15:23:39 +0000 (10:23 -0500)]
Remove non-base datasets. (#1275)
* Remove non-base datasets.
* Update Dataset documentation.
* Move helper methods out of Dataset and document them.
* Remove , replace with explicit constructor calls.
Guillaume Klein [Sat, 9 Feb 2019 13:38:31 +0000 (14:38 +0100)]
Freeze tqdm version (#1278)
Guillaume Klein [Sat, 9 Feb 2019 11:41:46 +0000 (12:41 +0100)]
Experimental FP16 training (#1208)
* Experimental FP16 training
* Make batch size a multiple of 8 in case of FP16 training
* Force max_generator_batches to 0 in FP16 training
* Mark FP16 as experimental
* Make FP16 loss scale configurable
* Support FusedAdam from APEX
* Fix generator inspection index
Vincent Nguyen [Sat, 9 Feb 2019 09:35:43 +0000 (10:35 +0100)]
doc clean-ups, remove old beam (#1268)
* doc clean-ups, rename legacy<>fast
* fix remove globalattn from transformer
* remove old beam calls -now fast is default
* keep old beam for coverage_penalty for now
François Hernandez [Fri, 8 Feb 2019 11:08:18 +0000 (12:08 +0100)]
Moving average of model parameters (#1273)
* naive moving average on gpu
* validate with moving average if exists
* save moving average parameters instead of last update
* add average_every option
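A naive sketch of the exponential moving average of parameters described above (the actual implementation stores it in FP32 and runs on tensors; plain floats here for clarity): after each update, blend the current parameters into the running average.

```python
# Hypothetical sketch: EMA over a flat list of parameter values.
def update_moving_average(avg, params, decay=0.9999):
    if avg is None:
        return list(params)          # first step: copy parameters
    return [decay * a + (1.0 - decay) * p for a, p in zip(avg, params)]
```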
Guillaume Klein [Fri, 8 Feb 2019 08:59:33 +0000 (09:59 +0100)]
Training option to iterate on the dataset once (#1274)
* Training option to iterate on the dataset once
* Add warning when single_pass is used but train_steps is non zero
Guillaume Klein [Thu, 7 Feb 2019 16:04:28 +0000 (17:04 +0100)]
Let the trainer decide when to save checkpoints (#1272)
In particular, the trainer should always save a checkpoint on the last
iteration. This PR also adds support for "-save_checkpoint_steps 0" to
only save the last checkpoint.
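The saving rule this PR describes can be sketched as a single predicate (names are illustrative): save every `save_checkpoint_steps` steps, always save on the last step, and with `save_checkpoint_steps == 0` save only the final checkpoint.

```python
# Hypothetical sketch of the checkpoint-saving decision.
def should_save(step, train_steps, save_checkpoint_steps):
    if step == train_steps:
        return True                  # always keep the last checkpoint
    if save_checkpoint_steps == 0:
        return False                 # 0 means: only the last one
    return step % save_checkpoint_steps == 0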
flauted [Thu, 7 Feb 2019 14:57:47 +0000 (09:57 -0500)]
Refactor, integrate penalties and global scorer into fast translation (#1270)
* Refactor, integrate penalties and global scorer into fast translation.
* Update GlobalScorerStub for beam_search tests.
* Change min-length behavior, update tests, and document it.
Guillaume Klein [Thu, 7 Feb 2019 09:18:33 +0000 (10:18 +0100)]
Simplify the training loop (#1265)
* Simplify the training loop
* The outer `while steps <= train_steps` is not useful as the dataset
yields an infinite iterator
* Multi-GPU sharding and batch accumulation can be expressed in terms
of iterator wrappers
* Remove reduce_counter variable
* Rely on optim.training_step instead of manual increments
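The second bullet above, expressing batch accumulation as an iterator wrapper, can be sketched like this (illustrative, not the trainer's actual code): group the batch stream into chunks of `accum_count`, so the loop body sees one group per optimizer step.

```python
# Hypothetical sketch: wrap a batch iterator so each yielded item is a
# group of accum_count batches to accumulate gradients over.
def accum_batches(batches, accum_count):
    group = []
    for b in batches:
        group.append(b)
        if len(group) == accum_count:
            yield group
            group = []
    if group:                        # trailing partial group, if any
        yield group
```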
flauted [Thu, 7 Feb 2019 07:25:42 +0000 (02:25 -0500)]
Better documentation for data pipeline (#1269)
* Partial update to docs.
* Improve inputter documentation.
* Raise FileNotFoundError instead of RuntimeError.
François Hernandez [Wed, 6 Feb 2019 13:11:45 +0000 (14:11 +0100)]
Relative Position Representations (#1244)
* add relative position representations for self attention.
Guillaume Klein [Tue, 5 Feb 2019 17:38:58 +0000 (18:38 +0100)]
Cleanup .travis.yml (#1266)
Guillaume Klein [Tue, 5 Feb 2019 14:44:52 +0000 (15:44 +0100)]
Make validation set optional (#1264)
* Make validation set optional
* Log if training is run with validation or not
flauted [Tue, 5 Feb 2019 09:22:41 +0000 (04:22 -0500)]
Refactor & test fast beam search method, add ngram blocking (#1258)
* Test beam advance function.
* Refactor and test fast beam search method.
* Add block_ngram to fast translate, rename test_beam_search, fix semantically wrong tests.
Guillaume Klein [Mon, 4 Feb 2019 10:26:51 +0000 (11:26 +0100)]
Only decode binary string type in TextDataset (#1260)
Vincent Nguyen [Mon, 4 Feb 2019 07:26:27 +0000 (08:26 +0100)]
Update train.md
flauted [Mon, 4 Feb 2019 07:21:29 +0000 (02:21 -0500)]
Datareaders (#1221)
* Start abstracting out dataset readers.
* Remove make_examples in favor of Reader.read
* Uniform spacing around imports.
* _check_deps as a classmethod of the reader.
* Move reader.read calls into DatasetBase init.
* Add 'empty' data reader __init__ to reader's base; delete from TextDataReader.
* Make readers a class attribute of DatasetBase instead of passing as args.
* Revert "Make readers a class attribute of DatasetBase instead of passing as args."
This reverts commit
cc8cc980539cd95a45461ab5b1e4c9b4904ba29b .
* Add from_opt to readers; undo __init__ taking all the args.
* Add tests for data readers.
Guillaume Klein [Fri, 1 Feb 2019 17:01:24 +0000 (18:01 +0100)]
Reintroduce partial backward if max_generator_batches > 0 (#1254)
The removal changed the memory requirements of the training which
could possibly break existing user configurations.
This also reintroduces the unwanted side-effect that the loss
computation initiates the backward pass unless max_generator_batches
is 0.