# tlm **Repository Path**: qaq123123/tlmt ## Basic Information - **Project Name**: tlm - **Description**: No description available - **Primary Language**: Unknown - **License**: Not specified - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2023-12-13 - **Last Updated**: 2023-12-14 ## Categories & Tags **Categories**: Uncategorized **Tags**: tlm ## README # TLM: Token-Level Masking for Transformers (EMNLP2023) The public code.\ In this paper, we propose a new regularization scheme based on token-level rather than structure-level to reduce overfitting. Specifically, we devise a novel Token-Level Masking (TLM) training strategy for Transformers to regularize the connections of self-attention, which consists of two masking techniques that are effective and easy to implement. The underlying idea is to manipulate the connections between tokens in the multi-head attention via masking, where the networks are forced to exploit partial neighbors’ information to produce a meaningful representation. ## Note we release the code => BERT-base (bert-base-cased) with TLM on RTE - if you do not use attention dropout, set attention_probs_dropout_prob=0 - if you do not use tlm, set --use_tlm=0 ## Installation ```shell git clone https://github.com/Young1993/tlm.git cd tlm/ pip install -r requirements.txt ``` ### Use TLM ```shell cd glue sh bert-base_sh/rte_train_tlm.sh ``` ### without TLM cd glue sh bert-base_sh/rte_train.sh ### Notice # Todo - ✅ Run Bert-base with TLM - To test QWen/LLama/sqlcoder with TLM #Citation ```text @inproceedings{wu-etal-2023-tlm, title = "{TLM}: Token-Level Masking for Transformers", author = "Wu, Yangjun and Fang, Kebin and Zhang, Dongxiang and Wang, Han and Zhang, Hao and Chen, Gang", booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing", month = dec, year = "2023", address = "Singapore", publisher = "Association for Computational Linguistics", url = "https://aclanthology.org/2023.emnlp-main.871", pages = "14099--14111" } ``` [TLM: Token-Level Masking for Transformers](https://aclanthology.org/2023.emnlp-main.871) (Wu et al., EMNLP 2023) If you have any questions, please feel free to talk with me (email: yangjun.wu@connect.polyu.hk)