Information Retrieval and Text Mining, NLP and Machine Learning | (9/12) Exercise 1. Loading an entire file into memory --> printing it in octal and hexadecimal - Daum Cafe

<p><br></p>
<p><a href="javascript:checkVirus('grpid%3D1Rkrq%26fldid%3DVPYX%26dataid%3D1%26fileid%3D6%26regdt%3D20170804162812&url=http%3A%2F%2Fcfile287.uf.daum.net%2Fattach%2F99CFCC3359A76EE12AE76C')"><img src="https://t1.daumcdn.net/daumtop_deco/icon/icon.hanmail.net/editor/p_etc_s.gif" border="0" alt="Attachment" class="vam"/> loadfile.c</a></p>
<p><a href="javascript:checkVirus('grpid%3D1Rkrq%26fldid%3DVPYX%26dataid%3D1%26fileid%3D7%26regdt%3D20170804162812&url=http%3A%2F%2Fcfile269.uf.daum.net%2Fattach%2F99DB103359A76EF0355191')"><img src="https://t1.daumcdn.net/daumtop_deco/icon/icon.hanmail.net/editor/p_etc_s.gif" border="0" alt="Attachment" class="vam"/> loadfile.exe</a></p>
<p><span style="font-size: 12pt;"><br></span></p>
<p><span style="font-size: 12pt;">#include &lt;stdio.h&gt;</span></p>
<p><span style="font-size: 12pt;">#include &lt;stdlib.h&gt;</span></p>
<p><span style="font-size: 12pt;">#include &lt;string.h&gt;</span></p>
<p><br></p>
<p><span style="font-size: 12pt;">/*</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">Load the entire input file into memory</span></p>
<p><span style="font-size: 12pt;">*/</span></p>
<p><span style="font-size: 12pt;">unsigned char *loadFile(char *filename)</span></p>
<p><span style="font-size: 12pt;">{</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">long n;</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">unsigned char *p;</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">FILE *fp;</span></p>
<p><br></p>
<p><span style="color: rgb(255, 0, 0); font-size: 12pt; white-space: pre;"> </span><span style="color: rgb(255, 0, 0); font-size: 12pt;">fp = fopen(filename, "rb");</span><span style="color: rgb(255, 0, 0); font-size: 12pt; white-space: pre;"> </span><span style="color: rgb(255, 0, 0); font-size: 12pt;">// [Caution] open 'fp' in "rb" (binary) mode</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">if (!fp) {</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="color: rgb(0, 0, 0); font-size: 12pt;">p = (unsigned char *) malloc(strlen(filename)+1);</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">if (p == NULL) return NULL;</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">strcpy((char *) p, filename);</span><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">// open failed: return a copy of the file name itself</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">return p;</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">}</span></p>
<p><br></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">fseek(fp, 0L, SEEK_END);</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="color: rgb(255, 0, 0); font-size: 12pt;">n = ftell(fp);</span><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">// n: byte size of file 'fp'</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">if (n &lt; 3) { fclose(fp); return NULL; }</span><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">// close 'fp' before the early return</span></p>
<p><br></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="color: rgb(255, 0, 0); font-size: 12pt;">p = (unsigned char *) malloc(n+1);</span><span style="color: rgb(255, 0, 0); font-size: 12pt; white-space: pre;"> </span><span style="color: rgb(255, 0, 0); font-size: 12pt;">// memory allocation</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">if (p == NULL) { fclose(fp); return NULL; }</span></p>
<p><br></p>
<p><span style="font-size: 12pt;">//</span><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">memset(p, 0, n+1);</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">fseek(fp, 0L, SEEK_SET);</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="color: rgb(255, 0, 0); font-size: 12pt;">n = (long) fread(p, sizeof(unsigned char), n, fp);</span><span style="color: rgb(255, 0, 0); font-size: 12pt; white-space: pre;"> </span><span style="color: rgb(255, 0, 0); font-size: 12pt;">// read 'fp' to 'p'; n = bytes actually read</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">*(p+n) = '\0';</span></p>
<p><br></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">fclose(fp);</span></p>
<p><span style="font-size: 12pt; white-space: pre;"> </span><span style="font-size: 12pt;">return p;</span></p>
<p><span style="font-size: 12pt;">}</span></p>
<p><span style="font-size: 12pt;">//============================================</span></p>
<p><span style="font-size: 12pt;"><br></span></p>
<p><span style="font-size: 12pt;">//</span><span style="font-size: 16px;">==== contents of test.txt</span></p>
<p><span style="font-size: 16px;">ABCDabcd0123가각간갇갈</span></p>
<p><br></p>
<p><span style="font-size: 12pt;">//==== </span><span style="font-size: 16px;">$ od -x test.txt   // </span><span style="font-size: 12pt;">print in hexadecimal</span></p>
<p><span style="font-size: 16px;">0000000 4241 4443 6261 6463 3130 3332 a1b0 a2b0</span></p>
<p><span style="font-size: 16px;">0000020 a3b0 a4b0 a5b0 0a0d</span><span style="font-size: 16px;"> </span><span style="font-size: 16px;">0a0d</span></p>
<p><span style="font-size: 16px;">0000032</span><br></p>
<p><span style="font-size: 16px;"><br></span></p>
<p><span style="font-size: 12pt;">//==== $ od -c test.txt</span><br></p>
<div><div><span style="font-size: 16px;">0000000   A   B   C   D   a   b   c   d   0   1   2   3 260 241 260 242</span></div><div><span style="font-size: 16px;">0000020 260 243 260 244 260 245  \r  \n  \r  \n</span></div></div>
<p><span style="font-size: 16px;">0000032</span><br></p>
<p><span style="font-size: 16px;"><br></span></p>
<p><span style="font-size: 16px;">//</span><span style="font-size: 16px;">==== $ od -o test.txt</span><span style="font-size: 16px;"> </span></p>
<p><span style="font-size: 16px;">0000000 041101 042103 061141 062143 030460 031462 120660 121260</span></p>
<p><span style="font-size: 16px;">0000020 121660 122260 122660 005015 005015</span></p>
<p><span style="font-size: 16px;">0000032</span></p>
<p><span style="font-size: 16px;"><br></span></p>
<p><span style="font-size: 11pt;">&lt;Note&gt; The files below were saved from test.txt in Unicode encodings using Windows Notepad.</span></p>
<p><span style="font-size: 11pt;">   On Linux, use the 'iconv' command to convert between encodings --> the difference is the BOM marker.</span></p>
<p><span style="font-size: 11pt;"><br></span></p>
<p><a href="javascript:checkVirus('grpid%3D1Rkrq%26fldid%3DVPYX%26dataid%3D1%26fileid%3D2%26regdt%3D20170804162812&url=http%3A%2F%2Fcfile276.uf.daum.net%2Fattach%2F99F8CB3359A76CB7284824')"><img src="https://t1.daumcdn.net/daumtop_deco/icon/icon.hanmail.net/editor/p_etc_s.gif" border="0" alt="Attachment" class="vam"/> test.txt</a></p>
<p><a href="javascript:checkVirus('grpid%3D1Rkrq%26fldid%3DVPYX%26dataid%3D1%26fileid%3D3%26regdt%3D20170804162812&url=http%3A%2F%2Fcfile245.uf.daum.net%2Fattach%2F99FC643359A76CCE2AE17A')"><img src="https://t1.daumcdn.net/daumtop_deco/icon/icon.hanmail.net/editor/p_etc_s.gif" border="0" alt="Attachment" class="vam"/> test-BE.txt</a></p>
<p><a href="javascript:checkVirus('grpid%3D1Rkrq%26fldid%3DVPYX%26dataid%3D1%26fileid%3D4%26regdt%3D20170804162812&url=http%3A%2F%2Fcfile274.uf.daum.net%2Fattach%2F9968073359A76CE02AC611')"><img src="https://t1.daumcdn.net/daumtop_deco/icon/icon.hanmail.net/editor/p_etc_s.gif" border="0" alt="Attachment" class="vam"/> test-LE.txt</a></p>
<p><a href="javascript:checkVirus('grpid%3D1Rkrq%26fldid%3DVPYX%26dataid%3D1%26fileid%3D5%26regdt%3D20170804162812&url=http%3A%2F%2Fcfile284.uf.daum.net%2Fattach%2F9911803359A76CFD122E5C')"><img src="https://t1.daumcdn.net/daumtop_deco/icon/icon.hanmail.net/editor/p_etc_s.gif" border="0" alt="Attachment" class="vam"/> test-UTF8.txt</a></p>
<p><span style="font-size: 16px;"><br></span></p>
<p><span style="color: rgb(9, 0, 255); font-size: 16px;">&lt;Exercise&gt; When test.txt is converted with iconv, how does the result differ from the files converted with Windows Notepad above?</span></p>
<p><span style="color: rgb(9, 0, 255); font-size: 16px;"><br></span></p>
<p><span style="color: rgb(9, 0, 255); font-size: 16px;">&lt;Note&gt; Text files written on DOS (Windows) and on Linux differ in their line-break characters.</span></p>
<p><span style="color: rgb(9, 0, 255); font-size: 16px;"> DOS (Windows) uses CR-LF; Linux (Unix family) uses LF.</span></p>
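The two notes in the post (BOM marker, CR-LF versus LF) can be checked directly on a buffer returned by loadFile(). The following is a minimal sketch, not part of the attached loadfile.c: the names BomKind, detectBOM, and countLineEndings are hypothetical, only three byte-order marks are recognized, and FF FE is reported as UTF-16LE even though a UTF-32LE file starts with the same two bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type for the sketch below. */
typedef enum { BOM_NONE, BOM_UTF8, BOM_UTF16LE, BOM_UTF16BE } BomKind;

/* Inspect the first bytes of buf and report which byte-order mark,
   if any, the buffer starts with. */
BomKind detectBOM(const unsigned char *buf, size_t n)
{
    assert(buf != NULL);
    if (n >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;
    if (n >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16LE;
    if (n >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16BE;
    return BOM_NONE;
}

/* Count CR-LF pairs and bare LF bytes. Assumes a byte-oriented
   encoding (EUC-KR, CP949, UTF-8) where 0x0A is always a line feed;
   it is not meant for UTF-16 buffers. */
void countLineEndings(const unsigned char *buf, size_t n,
                      size_t *crlf, size_t *lf)
{
    size_t i;
    assert(buf != NULL && crlf != NULL && lf != NULL);
    *crlf = 0;
    *lf = 0;
    for (i = 0; i < n; i++) {
        if (buf[i] != '\n')
            continue;
        if (i > 0 && buf[i - 1] == '\r')
            (*crlf)++;      /* DOS/Windows style */
        else
            (*lf)++;        /* Unix style */
    }
}
```

For the 26-byte test.txt shown in the od listings, detectBOM should report no BOM and countLineEndings should find two CR-LF pairs and no bare LF, matching the trailing `\r \n \r \n` in the `od -c` output.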

Posted by nlp, 17.08.04 16:28

License terms shown on the post: attribution, content modification, noncommercial

Comments

nlp
Post author, 17.08.04 16:33

[Exercise 1] Write a program that prints the entire contents of the loaded file in octal or hexadecimal --> the Linux od command
[Exercise 2] Count the words in the loaded file --> the Linux wc command
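
One possible starting point for the two exercises, written as a sketch rather than a reference solution. It assumes the file is already in memory (for example via loadFile() plus a known byte count); leWord, dumpWords, and countText are hypothetical names. The dump follows the layout of the od listings in the post: a 7-digit octal offset, eight 16-bit words per line with the first byte of each pair as the low-order byte, and a final line holding the total size. The word count treats any run of non-whitespace bytes as one word, which only approximates what wc reports.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* 16-bit word at byte offset i, low byte first (little-endian).
   A missing high byte at the end of the buffer counts as 0. */
unsigned int leWord(const unsigned char *buf, size_t n, size_t i)
{
    unsigned int lo, hi;
    assert(buf != NULL && i < n);
    lo = buf[i];
    hi = (i + 1 < n) ? buf[i + 1] : 0;
    return (hi << 8) | lo;
}

/* Exercise 1 sketch: print buf as 16-bit words, 8 words per line,
   in base 8 (like "od -o") or base 16 (like "od -x"). */
void dumpWords(const unsigned char *buf, size_t n, int base, FILE *out)
{
    size_t i;
    assert(out != NULL && (base == 8 || base == 16));
    for (i = 0; i < n; i += 2) {
        if (i % 16 == 0)    /* new line: print the byte offset in octal */
            fprintf(out, "%s%07lo", i ? "\n" : "", (unsigned long) i);
        if (base == 8)
            fprintf(out, " %06o", leWord(buf, n, i));
        else
            fprintf(out, " %04x", leWord(buf, n, i));
    }
    if (n > 0)
        fputc('\n', out);
    fprintf(out, "%07lo\n", (unsigned long) n);  /* final offset = size */
}

/* Exercise 2 sketch: count lines ('\n' bytes) and words
   (maximal runs of non-whitespace bytes). */
void countText(const unsigned char *buf, size_t n, long *lines, long *words)
{
    size_t i;
    int inWord = 0;
    assert(buf != NULL && lines != NULL && words != NULL);
    *lines = 0;
    *words = 0;
    for (i = 0; i < n; i++) {
        if (buf[i] == '\n')
            (*lines)++;
        if (isspace(buf[i]))
            inWord = 0;
        else if (!inWord) {
            inWord = 1;
            (*words)++;
        }
    }
}
```

Calling `dumpWords(p, 26, 16, stdout)` on the bytes of test.txt should print words such as 4241 and a1b0 in the same order as the `od -x` listing; passing 8 as the base gives the `od -o` form.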
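nlp
Post author, 17.08.31 11:58

iconv usage examples
$ iconv --help
$ iconv --list   (or: iconv -l)

$ iconv -f utf8 -t euckr test.txt > euckr.txt
$ iconv -f utf8 -t cp949 test.txt > euckr.txt
(-f names the source encoding and -t the target encoding; the input file must actually be in the -f encoding.)

<Note> How can the iconv() function be called directly from a C main() function using the iconv library (libiconv.a)?
Use iconv_open(), iconv(), and iconv_close().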
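
A minimal sketch of that call sequence, assuming a POSIX-style <iconv.h>. The wrapper name convertBuffer and its parameters are hypothetical; only iconv_open(), iconv(), and iconv_close() come from the library. Which encoding names are accepted (for example "EUC-KR" or "CP949") depends on the installed iconv implementation, and when a standalone libiconv is used instead of the C library's built-in iconv, the program typically has to be linked with -liconv.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/* Convert inLen bytes of `in` from encoding `from` to encoding `to`,
   writing at most outCap bytes into `out`.
   Returns the number of bytes written, or (size_t)-1 on any failure
   (unknown encoding, invalid input, or output buffer too small). */
size_t convertBuffer(const char *from, const char *to,
                     const char *in, size_t inLen,
                     char *out, size_t outCap)
{
    iconv_t cd;
    char *src = (char *) in;    /* iconv() takes char **, input is not modified */
    char *dst = out;
    size_t srcLeft = inLen, dstLeft = outCap, rc;

    assert(in != NULL && out != NULL);

    cd = iconv_open(to, from);  /* note the order: target first, then source */
    if (cd == (iconv_t) -1)
        return (size_t) -1;

    rc = iconv(cd, &src, &srcLeft, &dst, &dstLeft);
    if (rc != (size_t) -1)      /* flush any pending shift sequence */
        rc = iconv(cd, NULL, NULL, &dst, &dstLeft);
    iconv_close(cd);

    return (rc == (size_t) -1) ? (size_t) -1 : outCap - dstLeft;
}
```

A main() function could pass the buffer from loadFile() as `in` and write the converted bytes to a new file. Converting the three UTF-8 bytes EA B0 80 (the syllable 가, U+AC00) to "UTF-16LE" yields the two bytes 00 AC.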
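nlp
Post author, 17.09.12 14:19

The octal output of "$ od -o" works on 16-bit words in byte-swapped (little-endian) form: each pair of bytes is taken in the order stored on disk, with the first byte as the low-order byte, and the resulting 16 bits are split into 3-bit groups from the right and printed as octal digits.
Example: "AB" is stored as the bytes 41 42, which form the word 0x4241; in binary, 0100 0010 0100 0001 --> 0 100 001 001 000 001 (octal 041101)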
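
The 3-bit grouping in that example can be written out by hand instead of relying on printf's %o. This is a small sketch with a hypothetical function name, wordToOctal; it produces six digits because 16 bits split into five full 3-bit groups plus one leading bit.

```c
#include <assert.h>

/* Write the 6 octal digits of a 16-bit word into out
   (7 bytes including the terminating '\0'), 3 bits per digit,
   filling from the rightmost digit to the leftmost. */
void wordToOctal(unsigned int word, char out[7])
{
    int d;
    assert(word <= 0xFFFFu);
    for (d = 5; d >= 0; d--) {
        out[d] = (char) ('0' + (word & 7u));   /* lowest 3 bits */
        word >>= 3;
    }
    out[6] = '\0';
}
```

With a 7-byte buffer `s`, `wordToOctal(0x4241, s)` leaves "041101" in `s`, the first word of the `od -o` listing.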
