# IO Streams: Other Topics

# 1. Bits, Bytes, and Characters

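A byte is a unit of measurement for data size and storage capacity in computing. One byte is normally eight bits, where a bit is a single binary digit (0 or 1).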

A character is a letter, digit, ideograph, or symbol used in a computer, such as 'A', 'B', '$', and '&'.

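How many bytes a character occupies depends on the encoding. In the GBK family of encodings commonly used on Chinese-language systems, an English letter or symbol occupies one byte and a Chinese character occupies two bytes.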

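Bytes per character in common encodings:

- ASCII: one English letter (upper or lower case) is one byte. ASCII itself cannot represent Chinese characters.
- GBK: one English letter is one byte, and one Chinese character is two bytes.
- UTF-8: one English letter is one byte, and one common Chinese character is three bytes.
- Unicode: Unicode is a character set that assigns a code point to each character, not a byte encoding; the byte count depends on which encoding (UTF-8, UTF-16, UTF-32) is used. The "one Chinese character is two bytes" figure often quoted for Unicode refers to UTF-16, where an English letter is also two bytes.
- Punctuation: English punctuation is one byte; Chinese punctuation is two bytes in GBK and three bytes in UTF-8. For example, the English full stop . takes 1 byte, and the Chinese full stop 。 takes 2 bytes in GBK.
- UTF-16: an English letter or a common Chinese character takes 2 bytes (some Chinese characters in the Unicode extension blocks take 4 bytes).
- UTF-32: every character takes 4 bytes.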
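The byte counts listed above can be checked directly with `String.getBytes(Charset)`. The following is a minimal sketch, not part of the original article: the class name `EncodingSizeDemo` and the helper `byteLength` are made up for illustration, `UTF_16BE` is used so that no byte-order mark is counted, and GBK is only queried if the runtime happens to support it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSizeDemo {
    // Hypothetical helper: number of bytes the text occupies in the given charset.
    public static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String letter = "A";
        String hanzi = "\u4e2d";       // the Chinese character 中
        String rare = "\uD840\uDC00";  // U+20000, a character outside the Basic Multilingual Plane

        System.out.println("UTF-8  letter: " + byteLength(letter, StandardCharsets.UTF_8));    // 1
        System.out.println("UTF-8  hanzi : " + byteLength(hanzi, StandardCharsets.UTF_8));     // 3
        System.out.println("UTF-16 letter: " + byteLength(letter, StandardCharsets.UTF_16BE)); // 2
        System.out.println("UTF-16 hanzi : " + byteLength(hanzi, StandardCharsets.UTF_16BE));  // 2
        System.out.println("UTF-16 rare  : " + byteLength(rare, StandardCharsets.UTF_16BE));   // 4

        // GBK and UTF-32 are looked up by name; availability depends on the runtime.
        for (String name : new String[] {"GBK", "UTF-32BE"}) {
            if (Charset.isSupported(name)) {
                Charset cs = Charset.forName(name);
                System.out.println(name + " letter: " + byteLength(letter, cs)
                        + ", hanzi: " + byteLength(hanzi, cs));
            }
        }
    }
}
```

Using `StandardCharsets.UTF_16` instead would add two bytes to every result, because that charset writes a byte-order mark when encoding.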

# 2. Comparing I/O Stream Efficiency

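First, compare a plain byte stream with a buffered byte stream:

Goal: compare the time taken by a plain byte stream (FileOutputStream) and a buffered byte stream (BufferedOutputStream) to write a large amount of data.
Data: build one long string by appending a 26-letter string 3 million times (about 78 million characters), convert it to a byte array, and write it to a file.
Method:
- Write the byte array directly with FileOutputStream.
- Write the same byte array with a BufferedOutputStream wrapped around a FileOutputStream.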

import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class MyTest {
    public static void main(String[] args) throws IOException {
        File file = new File("E:/2023-IO/test.txt");
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 3000000; i++) {
            sb.append("abcdefghijklmnopqrstuvwxyz");
        }
        byte[] bytes = sb.toString().getBytes();

        // Write with a plain byte stream
        long start = System.currentTimeMillis();
        write(file, bytes);
        long end = System.currentTimeMillis();

        // Write with a buffered byte stream
        long start2 = System.currentTimeMillis();
        bufferedWrite(file, bytes);
        long end2 = System.currentTimeMillis();

        System.out.println("Plain byte stream: " + (end - start) + " ms");
        System.out.println("Buffered byte stream: " + (end2 - start2) + " ms");
    }

    // Writes the whole array with a plain byte stream
    public static void write(File file, byte[] bytes) throws IOException {
        // try-with-resources closes the stream even if write() throws
        try (OutputStream os = new FileOutputStream(file)) {
            os.write(bytes);
        }
    }

    // Writes the whole array with a buffered byte stream
    public static void bufferedWrite(File file, byte[] bytes) throws IOException {
        try (BufferedOutputStream bo = new BufferedOutputStream(new FileOutputStream(file))) {
            bo.write(bytes);
        }
    }
}

Result:

Plain byte stream: 85 ms
Buffered byte stream: 80 ms

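This result surprised me: aren't buffered streams supposed to be much faster? To find out why, we have to look at the source. Here is the write method of BufferedOutputStream: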

public synchronized void write(byte b[], int off, int len) throws IOException {
    if (len >= buf.length) {
        /* If the request length exceeds the size of the output buffer,
           flush the output buffer and then write the data directly.
           In this way buffered streams will cascade harmlessly. */
        flushBuffer();
        out.write(b, off, len);
        return;
    }
    if (len > buf.length - count) {
        flushBuffer();
    }
    System.arraycopy(b, off, buf, count, len);
    count += len;
}

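The comment says it plainly: if the request length exceeds the size of the output buffer, the buffer is flushed and the data is written directly, so that buffered streams cascade harmlessly. The default buffer of BufferedOutputStream is 8192 bytes, while the array in this test is about 78 MB, so the whole array bypassed the buffer and both versions ended up doing the same work.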

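Summary

- When data is written in small pieces, a buffered stream can improve efficiency considerably, because it reduces the number of physical write operations.
- When data is written in one large piece, a buffered stream is no faster than a plain output stream, because it passes the large piece straight to the underlying stream instead of using its buffer. In that case the main benefit of a buffered stream is not performance but a consistent and convenient API.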
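The two cases in the summary can be made visible by counting how often the underlying stream is actually written to. The sketch below is an illustration under stated assumptions, not code from the article: `CountingOutputStream` and `downstreamCalls` are hypothetical names, an in-memory `ByteArrayOutputStream` stands in for the file, and the default `BufferedOutputStream` buffer size is used.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class BufferBypassDemo {
    // Hypothetical helper: counts how many write calls reach the wrapped stream.
    public static class CountingOutputStream extends OutputStream {
        private final OutputStream target;
        int calls;

        CountingOutputStream(OutputStream target) {
            this.target = target;
        }

        @Override
        public void write(int b) throws IOException {
            calls++;
            target.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) throws IOException {
            calls++;
            target.write(b, off, len);
        }
    }

    // Writes `chunks` arrays of `chunkSize` bytes through a BufferedOutputStream
    // and returns how many write calls reached the underlying stream.
    public static int downstreamCalls(int chunkSize, int chunks) {
        CountingOutputStream counter = new CountingOutputStream(new ByteArrayOutputStream());
        try (BufferedOutputStream bos = new BufferedOutputStream(counter)) {
            byte[] chunk = new byte[chunkSize];
            for (int i = 0; i < chunks; i++) {
                bos.write(chunk);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counter.calls;
    }

    public static void main(String[] args) {
        // One large array: passed straight through, so the buffer plays no role.
        System.out.println("1 x 1,000,000 bytes -> downstream calls: " + downstreamCalls(1_000_000, 1));
        // Many tiny writes: collected in the buffer and forwarded in large blocks.
        System.out.println("100,000 x 1 byte    -> downstream calls: " + downstreamCalls(1, 100_000));
    }
}
```

The second line should report far fewer than 100,000 downstream calls, which is where the speed-up of a buffered stream comes from when the underlying stream is a file.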

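So, to see the real gap between a plain byte stream and a buffered byte stream, the test must avoid writing one long array in a single call. The next comparison copies a file one byte at a time, first with plain byte streams and then with buffered byte streams.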

import java.io.*;

public class MyTest {
    public static void main(String[] args) throws IOException {
        // Source file and the two target files
        File data = new File("C:/Mu/data.zip");
        File a = new File("C:/Mu/a.zip");
        File b = new File("C:/Mu/b.zip");

        // Time the copy with plain byte streams
        long start = System.currentTimeMillis();
        copy(data, a);
        long end = System.currentTimeMillis();

        // Time the copy with buffered byte streams
        long start2 = System.currentTimeMillis();
        bufferedCopy(data, b);
        long end2 = System.currentTimeMillis();

        // Print the elapsed times
        System.out.println("Plain byte stream: " + (end - start) + " ms");
        System.out.println("Buffered byte stream: " + (end2 - start2) + " ms");
    }

    // Copies a file with plain byte streams
    public static void copy(File in, File out) throws IOException {
        try (InputStream is = new FileInputStream(in);
             OutputStream os = new FileOutputStream(out)) {
            int by;
            // One byte per read and per write, with no buffer in between
            while ((by = is.read()) != -1) {
                os.write(by);
            }
        }
    }

    // Copies a file with buffered byte streams
    public static void bufferedCopy(File in, File out) throws IOException {
        try (BufferedInputStream bi = new BufferedInputStream(new FileInputStream(in));
             BufferedOutputStream bo = new BufferedOutputStream(new FileOutputStream(out))) {
            int by;
            // Still one byte per call, but each call is served from the internal buffer
            while ((by = bi.read()) != -1) {
                bo.write(by);
            }
        }
    }
}

Result:

Plain byte stream: 39566 ms
Buffered byte stream: 52 ms

This time the difference between the plain and the buffered byte stream is obvious: the buffered version is about 760 times faster.
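The plain streams are slow here only because every `read()` and `write(int)` handles a single byte. A plain stream can avoid that by moving data through a byte array of its own. The following sketch is an addition for illustration, not a measurement from the article: `copyWithArray` and `roundTrip` are hypothetical names, the 8192-byte array size is an arbitrary choice, and temporary files are used instead of the article's paths.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class ArrayCopyDemo {
    // Copies a file with plain (unbuffered) streams, one array at a time.
    public static long copyWithArray(Path in, Path out) throws IOException {
        long total = 0;
        try (InputStream is = new FileInputStream(in.toFile());
             OutputStream os = new FileOutputStream(out.toFile())) {
            byte[] buffer = new byte[8192];
            int len;
            while ((len = is.read(buffer)) != -1) {
                os.write(buffer, 0, len); // write only the bytes actually read
                total += len;
            }
        }
        return total;
    }

    // Hypothetical demo helper: writes data to a temp file, copies it, returns the copy's content.
    public static byte[] roundTrip(byte[] data) {
        try {
            Path src = Files.createTempFile("array-copy-src", ".bin");
            Path dst = Files.createTempFile("array-copy-dst", ".bin");
            try {
                Files.write(src, data);
                copyWithArray(src, dst);
                return Files.readAllBytes(dst);
            } finally {
                Files.deleteIfExists(src);
                Files.deleteIfExists(dst);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[100_000];
        Arrays.fill(data, (byte) 7);
        System.out.println("copy identical: " + Arrays.equals(data, roundTrip(data)));
    }
}
```

Timing this variant against the two methods above on the same file would show how much of the gap is due to the number of calls rather than to the buffered classes themselves.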

Now compare plain character streams with buffered character streams:

import java.io.*;

public class IOTest {
    // Prepare the data: write a large amount of text to a file
    public static void dataReady() throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 600000; i++) {
            sb.append("abcdefghijklmnopqrstuvwxyz");
        }
        try (OutputStream os = new FileOutputStream(new File("C:/Mu/data.txt"))) {
            os.write(sb.toString().getBytes());
        }
        System.out.println("Data ready");
    }

    // Copy a file with plain character streams, one character at a time
    public static void copy(File in, File out) throws IOException {
        try (Reader reader = new FileReader(in);
             Writer writer = new FileWriter(out)) {
            int ch;
            while ((ch = reader.read()) != -1) {
                writer.write((char) ch);
            }
        }
    }

    // Copy a file with plain character streams and a char array
    public static void copyChars(File in, File out) throws IOException {
        try (Reader reader = new FileReader(in);
             Writer writer = new FileWriter(out)) {
            char[] chs = new char[1024];
            int len;
            while ((len = reader.read(chs)) != -1) {
                // Write only the characters actually read; writing the whole array
                // would pad the copy with stale data on the last, partial read
                writer.write(chs, 0, len);
            }
        }
    }

    // Copy a file with buffered character streams, one line at a time
    public static void bufferedCopy(File in, File out) throws IOException {
        try (BufferedReader br = new BufferedReader(new FileReader(in));
             BufferedWriter bw = new BufferedWriter(new FileWriter(out))) {
            String line;
            while ((line = br.readLine()) != null) {
                bw.write(line);
                bw.newLine();
                // No flush per line: close() flushes whatever is left in the buffer
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Prepare the data
        dataReady();

        File data = new File("C:/Mu/data.txt");
        File a = new File("C:/Mu/a.txt");
        File b = new File("C:/Mu/b.txt");
        File c = new File("C:/Mu/c.txt");

        // Compare elapsed time and resulting file size for each approach
        long start = System.currentTimeMillis();
        copy(data, a);
        long end = System.currentTimeMillis();

        long start2 = System.currentTimeMillis();
        copyChars(data, b);
        long end2 = System.currentTimeMillis();

        long start3 = System.currentTimeMillis();
        bufferedCopy(data, c);
        long end3 = System.currentTimeMillis();

        System.out.println("Plain character stream (no array): " + (end - start) + " ms, file size: " + a.length() / 1024 + " kb");
        System.out.println("Plain character stream (char array): " + (end2 - start2) + " ms, file size: " + b.length() / 1024 + " kb");
        System.out.println("Buffered character stream (line by line): " + (end3 - start3) + " ms, file size: " + c.length() / 1024 + " kb");
    }
}

Result:

Plain character stream (no array): 794 ms, file size: 15234 kb
Plain character stream (char array): 45 ms, file size: 15235 kb
Buffered character stream (line by line): 74 ms, file size: 15234 kb

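Repeated runs give similar results. The buffered character stream is no faster here than a plain character stream that reads into a char array, so the main reason to use it is its readLine() and newLine() methods. Note that the test file contains no line breaks, so readLine() reads the whole file as a single line. The extra kilobyte reported for the char-array copy is what happens when the full 1024-char array is written on the last read even though it is only partially filled; writing only the characters actually read avoids it.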
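Since readLine() and newLine() are the practical reason to reach for the buffered character streams, here is a minimal sketch of line-oriented processing with them. It is an illustration only: the class `LineCopyDemo` and the method `numberLines` are hypothetical, and `StringReader`/`StringWriter` stand in for files so the example runs anywhere.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;

public class LineCopyDemo {
    // Prefixes every line of the text with its line number.
    public static String numberLines(String text) {
        StringWriter sink = new StringWriter();
        try (BufferedReader br = new BufferedReader(new StringReader(text));
             BufferedWriter bw = new BufferedWriter(sink)) {
            String line;
            int n = 1;
            // readLine() strips the line terminator and returns null at end of input
            while ((line = br.readLine()) != null) {
                bw.write(n + ": " + line);
                bw.newLine(); // writes the platform line separator
                n++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The writer was closed (and therefore flushed) by try-with-resources
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.print(numberLines("first\nsecond\nthird"));
    }
}
```

Because newLine() is called after every line, the output always ends with a line separator, even when the input did not.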

# 3. A Fresh Look at System.out and Scanner

# 3.1 The PrintStream Class

The System.out object we use every day is of type PrintStream, so it is an I/O stream object too.

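PrintStream adds functionality to another output stream, namely the ability to print representations of various data values conveniently. It offers two further features. Unlike other output streams, its print and write methods never throw an IOException; a failure only sets an internal flag that can be tested with checkError(). In addition, a PrintStream can be created so that it flushes automatically.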
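The "never throws IOException" behaviour is easy to overlook, so a small sketch may help. It is an illustration under assumptions: `FailingOutputStream` is a made-up stream that fails on every write, and `errorFlagAfterPrint` is a hypothetical helper; only `println` and `checkError()` are real PrintStream methods.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;

public class PrintStreamErrorDemo {
    // Hypothetical stream that fails on every write, to simulate an I/O error.
    public static class FailingOutputStream extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("simulated write failure");
        }
    }

    // Prints one line to the target and reports whether PrintStream recorded an error.
    public static boolean errorFlagAfterPrint(OutputStream target) {
        try (PrintStream ps = new PrintStream(target)) {
            ps.println("hello");     // no exception reaches the caller, even if the write fails
            return ps.checkError();  // flushes the stream and returns the error flag
        }
    }

    public static void main(String[] args) {
        System.out.println("failing target -> checkError: " + errorFlagAfterPrint(new FailingOutputStream()));
        System.out.println("healthy target -> checkError: " + errorFlagAfterPrint(new ByteArrayOutputStream()));
    }
}
```

Code that must not lose output silently therefore has to call checkError() itself, or write through a stream type that does throw.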

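- PrintStream(File file): creates a new print stream, without automatic line flushing, that writes to the specified file.
- PrintStream(File file, String csn): creates a new print stream, without automatic line flushing, that writes to the specified file using the named charset.
- PrintStream(OutputStream out): creates a new print stream that writes to the given output stream.
- PrintStream(OutputStream out, boolean autoFlush): creates a new print stream. If autoFlush is true, the output buffer is flushed whenever a byte array is written, one of the println methods is called, or a newline character or byte ('\n') is written.
- PrintStream(OutputStream out, boolean autoFlush, String encoding): creates a new print stream that uses the named character encoding.
- PrintStream(String fileName): creates a new print stream, without automatic line flushing, that writes to the file with the specified name.
- PrintStream(String fileName, String csn): creates a new print stream, without automatic line flushing, that writes to the file with the specified name using the named charset.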

import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.PrintStream;

public class PrintStreamExample {
    public static void main(String[] args) {
        PrintStream console = System.out; // keep the original stream so it can be restored
        try {
            // Wrap a FileOutputStream in a PrintStream with auto-flush enabled
            FileOutputStream out = new FileOutputStream("example.txt");
            PrintStream ps = new PrintStream(out, true, "UTF-8");

            // Print a string to the file
            ps.println("Hello, PrintStream!");

            // Print an integer to the file
            ps.println(123);

            // Print a boolean to the file
            ps.println(true);

            // Format a string with printf and print it
            ps.printf("Name: %s, Age: %d", "Alice", 30);

            // Close the stream and release its resources
            ps.close();

            // Redirect System.out to a file
            PrintStream fileOut = new PrintStream("system_out.txt");
            System.setOut(fileOut);

            // System.out.println now writes to "system_out.txt"
            System.out.println("This message is redirected to a file.");

            // Release the resource
            fileOut.close();
        } catch (FileNotFoundException e) {
            System.err.println("File not found: " + e.getMessage());
        } catch (Exception e) {
            System.err.println("An error occurred: " + e.getMessage());
        } finally {
            // Restore console output; otherwise System.out would stay bound to a closed stream
            System.setOut(console);
        }
    }
}

If automatic flushing (`autoFlush`) is not enabled, when does the data reach the file?

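- When the buffer fills up: if the output passes through a buffer (for example when the PrintStream wraps a BufferedOutputStream), the buffered data is written out automatically once the buffer is full. The buffer size depends on the implementation; BufferedOutputStream defaults to 8192 bytes.
- When flush() is called: calling the PrintStream's flush method forces the buffered data to be written immediately, without waiting for the buffer to fill.
- When close() is called: closing the stream first flushes any remaining buffered data and then releases the system resources associated with the stream. So even without automatic flushing, closing the stream properly ensures that all data is written to the file.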
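The effect of the autoFlush flag can be observed by placing a buffer between the PrintStream and an in-memory sink. This is a minimal sketch under assumptions: the class `AutoFlushDemo` and the helper `bytesVisibleAfterPrintln` are hypothetical, and a `ByteArrayOutputStream` stands in for the file so the number of bytes that have really arrived can be inspected.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class AutoFlushDemo {
    // Returns how many bytes have reached the sink right after println, before any flush or close.
    public static int bytesVisibleAfterPrintln(boolean autoFlush) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintStream ps = new PrintStream(new BufferedOutputStream(sink), autoFlush);
        ps.println("hello");
        int visible = sink.size(); // bytes that actually left the buffer so far
        ps.close();                // close() flushes the rest in either case
        return visible;
    }

    public static void main(String[] args) {
        System.out.println("autoFlush=false, visible after println: " + bytesVisibleAfterPrintln(false));
        System.out.println("autoFlush=true,  visible after println: " + bytesVisibleAfterPrintln(true));
    }
}
```

Without autoFlush the line stays in the buffer until flush() or close(); with autoFlush it reaches the sink as soon as println returns.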

# 3.2 The Scanner Class

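The Scanner class belongs to the java.util package. It provides methods for parsing primitive types and strings from raw data read from sources such as text files, input streams, and strings. Scanner is a flexible way to read input and is particularly suited to parsing user input.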

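Constructors

- Scanner(File source): creates a Scanner that reads data from the specified file.
- Scanner(File source, String charsetName): creates a Scanner that reads data from the specified file using the given character encoding.
- Scanner(InputStream source): creates a Scanner that reads data from the specified input stream.
- Scanner(InputStream source, String charsetName): creates a Scanner that reads data from the specified input stream using the given character encoding.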

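Common methods

- boolean hasNextXxx(): checks whether the next token in the scanner's input can be read as type Xxx. Xxx can be one of the primitive types, such as Int or Double, so hasNextInt() checks whether the next token is an integer.
- Xxx nextXxx(): reads the next token from the scanner's input as type Xxx and returns it. For example, nextInt() reads the next integer.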
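The hasNextXxx()/nextXxx() pair is typically used together: check first, then read. The sketch below illustrates this with a Scanner over a plain string; the class `ScannerParseDemo` and the method `sumInts` are hypothetical names chosen for this example.

```java
import java.util.Scanner;

public class ScannerParseDemo {
    // Adds up every integer token in the text and skips all other tokens.
    public static int sumInts(String text) {
        int sum = 0;
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNext()) {
                if (sc.hasNextInt()) {
                    sum += sc.nextInt(); // safe: the check above guarantees an int token
                } else {
                    sc.next();           // consume and ignore a non-integer token
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumInts("10 apples 20 pears 12")); // prints 42
    }
}
```

Calling nextInt() without the preceding hasNextInt() check would throw an InputMismatchException on the first non-integer token.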

import java.io.*;
import java.util.Scanner;

public class ScannerDemo {

    public static void main(String[] args) {
        writeToTextUsingScanner();
        readFromTextUsingScanner();
    }

    /**
     * Reads user input from the console and writes it to a text file.
     */
    public static void writeToTextUsingScanner() {
        // Note: closing this Scanner also closes System.in
        try (Scanner input = new Scanner(System.in);
             PrintStream ps = new PrintStream("1.txt")) {
            System.out.println("Enter text (type 'stop' to finish):");
            // hasNextLine() also ends the loop cleanly if the input is exhausted
            while (input.hasNextLine()) {
                String line = input.nextLine();
                // Stop writing and leave the loop when the user types "stop"
                if ("stop".equals(line)) {
                    break;
                }
                ps.println(line); // write the user's input to the file
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }

    /**
     * Reads the text file and prints its content to the console.
     */
    public static void readFromTextUsingScanner() {
        // try-with-resources ensures the resources are closed automatically
        try (Scanner scanner = new Scanner(new FileInputStream("1.txt"))) {
            System.out.println("File contents:");
            while (scanner.hasNextLine()) { // is there another line?
                String line = scanner.nextLine(); // read the next line
                System.out.println(line); // print it to the console
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }
}

# 3.3 The Three I/O Stream Objects of the System Class

In Java, the System class provides three predefined streams (System.out, System.in, and System.err), which are essential for console input and output (command-line I/O).

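System.out:

- Purpose: the standard output stream, used to print information to the console.
- Type: PrintStream.
- Common methods: print(), println(), printf(), and so on, which print formatted or unformatted data to the console. It is mainly used to display a program's results.
- Example: System.out.println("Hello, World!"); prints the string "Hello, World!" to the console.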

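System.in:

- Purpose: the standard input stream, used to receive input from the console.
- Type: InputStream.
- Typical use: because System.in is an InputStream, reading from it directly is cumbersome (you have to handle raw bytes), so it is usually combined with the Scanner class to read strings, integers, and other kinds of input conveniently.

Example: reading a string typed by the user with Scanner.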

Scanner scanner = new Scanner(System.in);
String input = scanner.nextLine();
System.out.println("You entered: " + input);
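For comparison, this is roughly what reading a line from an InputStream looks like without Scanner: the bytes have to be decoded into characters and then split into lines by hand, here with InputStreamReader and BufferedReader. It is a sketch under assumptions: the class `StdinLineDemo` and the method `firstLine` are hypothetical, and UTF-8 is assumed as the input encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StdinLineDemo {
    // Reads the first line from a byte stream; returns null if the stream is empty.
    public static String firstLine(InputStream in) {
        // InputStreamReader turns bytes into characters, BufferedReader adds readLine().
        // The reader is deliberately not closed, because closing it would close `in` as well.
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        try {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Type a line:");
        String line = firstLine(System.in);
        System.out.println("You entered: " + line);
    }
}
```

Scanner hides these steps and additionally parses tokens into numbers, which is why it is the usual choice for console input.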

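System.err:

- Purpose: the standard error stream, used to print error messages to the console.
- Type: PrintStream.
- Common methods: the same as System.out, namely print(), println(), printf(), and so on, used here for error or debugging output.
- Characteristics: System.err also prints to the console by default, but it is intended for error messages and diagnostic information. In many environments System.out and System.err can be redirected to different destinations (files, for example), so that normal output and error output can be handled separately.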

Here is how these three constants are declared in the System class:

public final static InputStream in = null;
public final static PrintStream out = null;
public final static PrintStream err = null;

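This raises some odd questions:

- The three constants are declared final yet initialised to null. A final constant cannot be changed once it has been assigned, so why does using them not cause a NullPointerException?
- Why are the three constants written in lower case? By naming convention, shouldn't final constants be upper case?
- The three constants have set methods? Isn't a final constant unmodifiable? How do the set methods change their values?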

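A final field cannot be reassigned within the rules of the Java language. The values of these three fields, however, are initialised and changed by native code (C/C++) through the native methods setIn0, setOut0, and setErr0, which operate outside those rules. By the time application code runs, the JVM has already replaced the null values with real streams, so no NullPointerException occurs. That is also why the fields are deliberately not written in upper case and why they have set methods.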

    // From the System source code
    public static void setOut(PrintStream out) {
        checkIO();
        setOut0(out);
    }
    public static void setErr(PrintStream err) {
        checkIO();
        setErr0(err);
    }
    public static void setIn(InputStream in) {
        checkIO();
        setIn0(in);
    }
    private static void checkIO() {
        SecurityManager sm = getSecurityManager();
        if (sm != null) {
            sm.checkPermission(new RuntimePermission("setIO"));
        }
    }
    private static native void setIn0(InputStream in);
    private static native void setOut0(PrintStream out);
    private static native void setErr0(PrintStream err);
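The practical consequence of these setters is that System.out can be swapped at run time even though the field is final. The following sketch temporarily redirects it into memory and then restores it; the class `RedirectDemo` and the method `captureStdout` are hypothetical names for this illustration, and only `System.setOut` comes from the source above.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class RedirectDemo {
    // Runs the action with System.out redirected into memory and returns what it printed.
    public static String captureStdout(Runnable action) {
        PrintStream original = System.out; // remember the current stream
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintStream capture = new PrintStream(buffer, true)) {
            System.setOut(capture);        // replaces the "final" field via native code
            action.run();
        } finally {
            System.setOut(original);       // always restore the previous stream
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String captured = captureStdout(() -> System.out.print("hidden"));
        System.out.println("captured: " + captured);
    }
}
```

Restoring the original stream in a finally block matters: without it, a failure inside the action would leave all later output going to the closed capture stream.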

Last updated: 2024/12/28, 18:32:08
