# Methodology

## Methodology

- We will try to reproduce the results of \textsuperscript{[Sennrich et al., 2016]} on multiple data sets
- Reuse all of the original settings for network training
- Use 3 different data sets:
    1) [OpenSubtitles 2016](https://obj.umiacs.umd.edu/mt-data/OpenSubtitles2016.en-fr.clean.tgz) (original data)
    2) [GYAFC](https://github.com/raosudha89/GYAFC-corpus) (alternate public data)
    3) [PhraseApp](https://phraseapp.com) (industry data)

- A lot of existing work is based on the same non-representative data sets
- **Let's change that!**