I have the output from the inference step of running run_gpt_v1.5_gpt4_1106_cot.py saved in the appropriate directory.
When I get to the scoring step, fread returns an empty list for the data object:
class Scorer:
    def __init__(self, files, data_list_of_dicts=None):
        if not len(files):
            print('No files for evaluation')
            import sys
            sys.exit()
        from efficiency.log import fread
        data_list = []
        for file in sorted(files):
            data = fread(file)
            data_list += data
            print(file, len(data))
fread appears to be designed to return an empty list whenever read_csv raises an error. This is not good practice: the original exception is hidden, so it is very difficult to diagnose why the user ends up with an empty dataframe and an eventual error here, especially since fread lives in another library:
    def truth_pred_scorer(self, df):
        df.drop(['prompt', 'question_id'], axis=1, inplace=True)
        df = self.apply_score_func(df)
The error occurs because the specified columns for dropping aren't in the dataframe, since it is empty.
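To illustrate the behavior I would have expected instead, here is a minimal sketch of a reader that fails loudly. The name `read_records_strict` and its `required_columns` parameter are hypothetical and not part of `efficiency` or this repo; the only library calls used are `pd.read_csv` and `DataFrame.to_dict`.

```python
import pandas as pd


def read_records_strict(path, required_columns=(), **read_csv_kwargs):
    """Read a CSV into a list of dicts, raising instead of returning []."""
    try:
        df = pd.read_csv(path, **read_csv_kwargs)
    except Exception as exc:
        # Chain the original exception so the root cause (for example an
        # encoding problem) stays visible in the traceback.
        raise RuntimeError(f"Could not read {path!r}: {exc}") from exc

    # Report missing columns here, rather than later as a KeyError from drop().
    missing = [col for col in required_columns if col not in df.columns]
    if missing:
        raise ValueError(f"{path!r} is missing expected columns: {missing}")

    return df.to_dict(orient="records")
```

With something like this, a file that cannot be decoded would stop at the read step with the path and the underlying error in the message, instead of surfacing later as a failed `drop` on an empty dataframe.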
I'm able to import my file by specifying an explicit encoding at line 258 of the efficiency.log module:
    data = pd.read_csv(path, encoding='cp1252').to_dict(orient="records")
But the fact that I have to modify another library highlights why this isn't ideal.
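In case it helps anyone hitting the same thing, below is a rough sketch for checking which encoding a file decodes with, without touching the library. It uses only the standard library; `find_working_encoding` and its default candidate list are my own hypothetical choices, and `cp1252` is included only because it is what worked for my file.

```python
def find_working_encoding(path, candidates=("utf-8", "cp1252")):
    """Return the first candidate encoding that decodes the whole file, else None."""
    with open(path, "rb") as handle:
        raw = handle.read()

    for encoding in candidates:
        try:
            raw.decode(encoding)
        except UnicodeDecodeError:
            # This candidate cannot decode the bytes; try the next one.
            continue
        return encoding
    return None
```

The returned name can then be passed as the `encoding` argument of `pd.read_csv`. Note that a successful decode only shows the encoding is a plausible candidate, not that it is the one the file was actually written with.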