BERT, a deep bidirectional transformer architecture, builds a language model that achieves state-of-the-art performance on several general NLP tasks, largely due to its self-attention mechanism, which captures rich, context-sensitive word and sentence semantics from a large corpus. However, a task like news search can be sensitive to changes in word meaning that arise after a major event (e.g., the meaning representation of "Las Vegas" after the shooting event). To capture such changes and generate a search-oriented language model, we propose two different language models. Our first model applies global term weighting scaling to the embedding layer so that the new BERT model can capture more relevant relationships than before. Our second model introduces a relevance-based attention head that attempts to incorporate user relevance decisions, and hence the perspective changes reflected in click-through data. Through a series of experiments, we show that the second model better captures relevance relationships between words and yields superior retrieval performance in news retrieval.
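As a rough illustration of the first model's idea, the sketch below scales rows of BERT's input word-embedding matrix by per-token global term weights. The IDF table, the choice of IDF as the weighting scheme, and the in-place scaling are assumptions for illustration; the abstract does not specify the exact scheme or integration point.

```python
import torch
from transformers import BertTokenizer, BertModel

# Hypothetical global term weights (e.g., IDF values computed from a
# news corpus); the actual weighting scheme is a design choice.
global_weights = {"las": 2.1, "vegas": 3.8, "shooting": 4.5}

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# The word-embedding matrix has shape (vocab_size, hidden_size).
embeddings = model.embeddings.word_embeddings

with torch.no_grad():
    for token, weight in global_weights.items():
        token_id = tokenizer.convert_tokens_to_ids(token)
        # Scale the token's embedding row by its global term weight,
        # so downstream self-attention sees the reweighted representation.
        embeddings.weight[token_id] *= weight
```

In practice one would apply such scaling across the full vocabulary (or at the embedding-lookup step) before fine-tuning, rather than to a handful of tokens as shown here.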