1. Adversarial attacks?
As in CV (adversarial examples).
With GPT-4:
the prompt changes imperceptibly,
the LLM evaluation result is overturned.
What has Microsoft done on this?
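A minimal sketch of how the "imperceptible prompt change overturns the result" concern could be measured. Everything here is an assumption for illustration: `judge` is a hypothetical callable that wraps whatever LLM evaluator is in use and returns a verdict string, and the perturbations in `perturb` are simple whitespace edits, not an established attack.

```python
from typing import Callable, List


def perturb(prompt: str) -> List[str]:
    """Return small surface-level variants of a prompt (hypothetical perturbations)."""
    candidates = [
        prompt + " ",                # add a trailing space
        prompt.replace(".", " ."),   # insert a space before each period
        prompt.replace("  ", " "),   # collapse double spaces
    ]
    variants: List[str] = []
    for candidate in candidates:
        # Keep only variants that differ from the original and are not duplicates.
        if candidate != prompt and candidate not in variants:
            variants.append(candidate)
    return variants


def flip_rate(judge: Callable[[str], str], prompt: str) -> float:
    """Fraction of perturbed prompts whose verdict differs from the original verdict."""
    baseline = judge(prompt)
    variants = perturb(prompt)
    if not variants:
        return 0.0
    flips = sum(1 for variant in variants if judge(variant) != baseline)
    return flips / len(variants)
```

A flip rate well above zero on edits this small would suggest the evaluation is not robust. The real judge call (for example a GPT-4 request) would replace the hypothetical `judge` argument.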
2. Evaluation dataset:
alpaca_farm
3. Give a reference / reason
like PandaLM.
Given two different LLM responses, how do we make sure the judgement is reasonable?
- Have the judge give its reason.
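A rough sketch of asking a judge for a verdict plus a reason and rejecting answers that omit the reason. The prompt template, the `Verdict:` / `Reason:` output format, and the names `build_prompt`, `parse_judgement`, and `Judgement` are all hypothetical; this is not PandaLM's actual format.

```python
import re
from dataclasses import dataclass

# Hypothetical template: forces the judge to state a reason next to its verdict.
JUDGE_TEMPLATE = (
    "Instruction:\n{instruction}\n\n"
    "Response 1:\n{response_1}\n\n"
    "Response 2:\n{response_2}\n\n"
    "Which response is better? Answer in exactly this format:\n"
    "Verdict: 1, 2, or Tie\n"
    "Reason: one sentence explaining the verdict\n"
)


@dataclass
class Judgement:
    verdict: str  # "1", "2", or "Tie"
    reason: str


def build_prompt(instruction: str, response_1: str, response_2: str) -> str:
    """Fill the judge template with the instruction and the two responses."""
    return JUDGE_TEMPLATE.format(
        instruction=instruction, response_1=response_1, response_2=response_2
    )


def parse_judgement(text: str) -> Judgement:
    """Extract verdict and reason; raise ValueError if either is missing."""
    verdict = re.search(r"^Verdict:[ \t]*(1|2|Tie)[ \t]*$", text, re.MULTILINE)
    reason = re.search(r"^Reason:[ \t]*(\S.*)$", text, re.MULTILINE)
    if verdict is None or reason is None:
        raise ValueError("judge output must contain both a verdict and a reason")
    return Judgement(verdict=verdict.group(1), reason=reason.group(1).strip())
```

Requiring the reason makes each verdict inspectable by a human. It does not by itself guarantee the verdict is correct.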