Reading and writing files on HDFS with the Hadoop API (the file encoding can be specified)

1. Reading a file

BufferedReader (the file encoding can be specified):

    private static void readHdfsFile(FileSystem fs, Path path)
            throws IOException {
        // Wrap the HDFS stream in a reader with an explicit charset
        BufferedReader br = new BufferedReader(
                new InputStreamReader(fs.open(path), "utf-8"));
        String line;
        while ((line = br.readLine()) != null) {
            System.out.println(line);
        }
        // Closing the reader also closes the underlying HDFS stream
        br.close();
    }
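
The helper above assumes a FileSystem handle (fs) already exists. Below is a minimal setup sketch, assuming the Hadoop client configuration (core-site.xml / hdfs-site.xml) is on the classpath; the class name and path are illustrative, and readHdfsFile is the method defined above, placed in the same class:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsReadDemo {
        public static void main(String[] args) throws IOException {
            Configuration conf = new Configuration();  // loads core-site.xml / hdfs-site.xml
            FileSystem fs = FileSystem.get(conf);      // connects to fs.defaultFS
            try {
                readHdfsFile(fs, new Path("/test/tmp/file"));  // readHdfsFile as defined above
            } finally {
                fs.close();
            }
        }
    }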

ByteArrayOutputStream (reads the whole file into memory at once, then decodes with the given charset):

    private static String readHdfsFile2(FileSystem fs, Path path, String charset)
            throws IOException {
        FSDataInputStream hdfsInStream = fs.open(path);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] ioBuffer = new byte[1024];
        // Copy the whole file into the in-memory buffer
        int readLen = hdfsInStream.read(ioBuffer);
        while (-1 != readLen) {
            bos.write(ioBuffer, 0, readLen);
            readLen = hdfsInStream.read(ioBuffer);
        }
        hdfsInStream.close();
        // Decode the raw bytes with the caller-supplied charset
        return new String(bos.toByteArray(), charset);
    }
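
Because the entire file is buffered in memory before decoding, this variant only suits small files. A usage sketch, assuming a hypothetical GBK-encoded file on HDFS:

        // Hypothetical path and encoding; any charset supported by the JVM works here
        String text = readHdfsFile2(fs, new Path("/test/tmp/file_gbk"), "gbk");
        System.out.println(text);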

2. Writing a file
BufferedWriter (with the file encoding specified):

        String content = "hello world";
        BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(
                fs.create(new Path("/test/tmp/file"), true)));
        bw.write(content);
        bw.close();

FSDataOutputStream (raw byte output):

        String content = "output stream";
        FSDataOutputStream dos = fs.create(new Path("/test/tmp/file2"), true);
        dos.writeBytes(content);
        dos.close();
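
For symmetry with the read helpers, here is a sketch of a write helper that takes the charset as a parameter; the method name and example path are illustrative, not part of the original examples:

    private static void writeHdfsFile(FileSystem fs, Path path, String content,
            String charset) throws IOException {
        // fs.create(path, true) overwrites the file if it already exists;
        // try-with-resources closes the writer (and the HDFS stream) automatically
        try (OutputStreamWriter writer =
                new OutputStreamWriter(fs.create(path, true), charset)) {
            writer.write(content);
        }
    }

    // e.g. writeHdfsFile(fs, new Path("/test/tmp/file3"), "你好, HDFS", "utf-8");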
