CUDA_VISIBLE_DEVICES=6 /home/jiangjinhao/anaconda3/envs/pt1.8-transformer4.18/bin/python main_nsm.py --diff_lr --linear_dropout 0.0 --log_steps 300 \
    --model_path ...
Extracts page-length chunks from a PDF file using PyMuPDF4LLM. Returns a list of dicts (one per page).

Processes all PDF files in the given folder. For each PDF file that does not yet have a ...
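
The two descriptions above could be sketched roughly as follows. This is a minimal illustration, not the original implementation: the function names `extract_page_chunks` and `process_folder`, the `extractor` and `suffix` parameters, and the JSON output format are all hypothetical. The description of the skip condition is cut off in the source, so the sketch assumes it means "does not yet have a sibling output file"; the real check may differ. The only library call used is `pymupdf4llm.to_markdown(..., page_chunks=True)`, which returns one dict per page.

```python
import json
from pathlib import Path


def extract_page_chunks(pdf_path):
    """Return a list of dicts, one per page of the PDF (hypothetical helper name)."""
    # Imported lazily so process_folder can be exercised without the library installed.
    import pymupdf4llm

    # With page_chunks=True, to_markdown returns a list of per-page dicts
    # instead of a single Markdown string.
    return pymupdf4llm.to_markdown(str(pdf_path), page_chunks=True)


def process_folder(folder, extractor=extract_page_chunks, suffix=".json"):
    """Run the extractor on every PDF in `folder` that has no output file yet.

    ASSUMPTION: "does not yet have a ..." is read here as "has no sibling file
    with the same stem and `suffix`"; the source text is truncated at that point.
    """
    written = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        out = pdf.with_suffix(suffix)
        if out.exists():
            continue  # already processed, skip
        chunks = extractor(pdf)
        # default=str keeps the dump from failing on values JSON cannot encode.
        out.write_text(json.dumps(chunks, default=str), encoding="utf-8")
        written.append(out)
    return written
```

Calling `process_folder("papers/")` (a hypothetical folder name) would write one JSON file next to each unprocessed PDF and return the list of paths it created. Passing the extractor as a parameter keeps the folder loop testable without real PDF files.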