When training large-scale language models, "Reinforcement Learning from Human Feedback (RLHF)" is applied to reflect evaluations by actual humans in the model's output. However, ...