String (JDK 1.8) Source Code Reading Notes

2018-09-05

String

In Java, strings are objects.
Java provides the String class for creating and manipulating strings.

Definition

The class is declared final, so it cannot be subclassed. It also implements:

java.io.Serializable
Comparable
CharSequence

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence { }

Fields


 /** The value is used for character storage.
  * A String is backed by a char[]; this array holds its characters.
  */
 private final char value[];

 /** Cache the hash code for the string.
  * The cached hash value, computed lazily by hashCode().
  */
 private int hash; // Default to 0

 /** use serialVersionUID from JDK 1.0.2 for interoperability.
  * Java serialization checks a class's serialVersionUID at runtime to verify version compatibility.
  */
 private static final long serialVersionUID = -6849794470754667710L;

 /**
  * Class String is special cased within the Serialization Stream Protocol.
  *
  * A String instance is written into an ObjectOutputStream according to
  * <a href="{@docRoot}/../platform/serialization/spec/output.html">
  * Object Serialization Specification, Section 6.2, "Stream Elements"</a>
  */
 private static final ObjectStreamField[] serialPersistentFields =
     new ObjectStreamField[0];

Constructors

String has more than a dozen constructors; the most commonly used ones are:

/**
 * Creates a String from another String (the new object shares the original's char[]).
 * Initializes a newly created {@code String} object so that it represents
 * the same sequence of characters as the argument; in other words, the
 * newly created string is a copy of the argument string. Unless an
 * explicit copy of {@code original} is needed, use of this constructor is
 * unnecessary since Strings are immutable.
 *
 * @param  original
 *         A {@code String}
 */
public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}

/**
 * Creates a String from a byte array.
 * The bytes are decoded with the platform's default charset here; other overloads accept an explicit charset.
 * Constructs a new {@code String} by decoding the specified array of bytes
 * using the platform's default charset.  The length of the new {@code
 * String} is a function of the charset, and hence may not be equal to the
 * length of the byte array.
 *
 * <p> The behavior of this constructor when the given bytes are not valid
 * in the default charset is unspecified.  The {@link
 * java.nio.charset.CharsetDecoder} class should be used when more control
 * over the decoding process is required.
 *
 * @param  bytes The bytes to be decoded into characters
 * @since  JDK1.1
 */
public String(byte bytes[]) {
    this(bytes, 0, bytes.length);
}

/**
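 * A String instance holds a char[] whose elements are Unicode (UTF-16) code units.
 * String and char are the in-memory form; byte is the serialized form used for
 * network transfer and storage, so byte[] and String often have to be converted
 * into each other during transfer and storage.
 * String therefore provides a family of overloaded constructors that turn a
 * byte array into a String, and any byte[] <-> String conversion depends on
 * the encoding.
 * For example:
 * String(byte bytes[], int offset, int length, Charset charset)
 * String(byte bytes[], String charsetName)
 * String(byte bytes[], int offset, int length, String charsetName)
 * and so on.
 * String(byte[] bytes, Charset charset) decodes the given byte array with
 * charset into a Unicode char[] and builds a new String from it.
 *
 * The constructor below lets the caller specify the charset of the byte array.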
 * Constructs a new {@code String} by decoding the specified array of
 * bytes using the specified {@linkplain java.nio.charset.Charset charset}.
 * The length of the new {@code String} is a function of the charset, and
 * hence may not be equal to the length of the byte array.
 *
 * <p> This method always replaces malformed-input and unmappable-character
 * sequences with this charset's default replacement string.  The {@link
 * java.nio.charset.CharsetDecoder} class should be used when more control
 * over the decoding process is required.
 *
 * @param  bytes
 *         The bytes to be decoded into characters
 *
 * @param  charset
 *         The {@linkplain java.nio.charset.Charset charset} to be used to
 *         decode the {@code bytes}
 *
 * @since  1.6
 */
 public String(byte bytes[], Charset charset) {
     this(bytes, 0, bytes.length, charset);
 }
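
To make the encoding dependence concrete, here is a minimal sketch that decodes the same UTF-8 bytes once with the matching charset and once with the wrong one. The class and method names are made up for this illustration and are not part of the JDK.

```java
import java.nio.charset.StandardCharsets;

// Illustrative helper class; the name is an assumption, not JDK code.
class CharsetDecodeDemo {

    // Encode as UTF-8 and decode as UTF-8: the text survives the round trip.
    static String roundTripUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encode as UTF-8 but decode as ISO-8859-1: every byte becomes one char.
    static String decodeUtf8AsLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // e with acute accent, two bytes in UTF-8
        System.out.println(roundTripUtf8(text).equals(text));  // true
        System.out.println(decodeUtf8AsLatin1(text).length()); // 2
    }
}
```

The second call does not fail; it silently produces two unrelated characters, which is how garbled text usually appears in practice.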

/**
 * Creates a String from a char array.
 * Allocates a new {@code String} so that it represents the sequence of
 * characters currently contained in the character array argument. The
 * contents of the character array are copied; subsequent modification of
 * the character array does not affect the newly created string.
 *
 * @param  value
 *         The initial value of the string
 */
public String(char value[]) {
    this.value = Arrays.copyOf(value, value.length);
}

/**
 * Creates a String from a StringBuffer.
 * Allocates a new string that contains the sequence of characters
 * currently contained in the string buffer argument. The contents of the
 * string buffer are copied; subsequent modification of the string buffer
 * does not affect the newly created string.
 *
 * @param  buffer
 *         A {@code StringBuffer}
 */
public String(StringBuffer buffer) {
    synchronized(buffer) {
        this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
    }
}

/**
 * Creates a String from a StringBuilder.
 * Allocates a new string that contains the sequence of characters
 * currently contained in the string builder argument. The contents of the
 * string builder are copied; subsequent modification of the string builder
 * does not affect the newly created string.
 *
 * <p> This constructor is provided to ease migration to {@code
 * StringBuilder}. Obtaining a string from a string builder via the {@code
 * toString} method is likely to run faster and is generally preferred.
 *
 * @param   builder
 *          A {@code StringBuilder}
 *
 * @since  1.5
 */
public String(StringBuilder builder) {
    this.value = Arrays.copyOf(builder.getValue(), builder.length());
}

 /*
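  * This constructor is package-private (not protected): String is final and
  * cannot be subclassed, so the constructor is for internal use only.
  * The second parameter is effectively unused and is always expected to be true.
  * As the code shows, the array reference is stored directly instead of being
  * copied, which improves performance and saves memory.
  * Keeping it out of the public API also protects the immutability of strings.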
  * Package private constructor which shares value array for speed.
  * this constructor is always expected to be called with share==true.
  * a separate constructor is needed because we already have a public
  * String(char[]) constructor that makes a copy of the given char[].
  */
  String(char[] value, boolean share) {
      // assert share : "unshared not supported";
      this.value = value;
  }
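
The copying done by the public constructors can be observed from outside. The following sketch (class and method names are illustrative assumptions) modifies the source array and the source builder after constructing a String and shows that the String is unaffected.

```java
// Illustrative class; the name is an assumption, not JDK code.
class DefensiveCopyDemo {

    // Build a String from a char[] and then modify the array.
    static String fromCharsThenMutate() {
        char[] chars = {'a', 'b', 'c'};
        String s = new String(chars); // the public constructor copies the array
        chars[0] = 'x';               // this write does not reach into s
        return s;
    }

    // Build a String from a StringBuilder and then keep appending.
    static String fromBuilderThenAppend() {
        StringBuilder sb = new StringBuilder("abc");
        String s = new String(sb);    // copies the builder's current contents
        sb.append("def");
        return s;
    }

    public static void main(String[] args) {
        System.out.println(fromCharsThenMutate());   // abc
        System.out.println(fromBuilderThenAppend()); // abc
    }
}
```

If the public char[] constructor shared the array the way the package-private one does, the first method would return "xbc", and strings would no longer be immutable.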

Commonly used methods

getBytes

/**
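 * Converts the string into a byte array.
 * Used heavily in communication code, for example network transfer, ISO 8583 messages, and socket I/O.
 * To avoid garbled text, both sides must agree on the byte encoding they use.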
 * Encodes this {@code String} into a sequence of bytes using the named
 * charset, storing the result into a new byte array.
 *
 * <p> The behavior of this method when this string cannot be encoded in
 * the given charset is unspecified.  The {@link
 * java.nio.charset.CharsetEncoder} class should be used when more control
 * over the encoding process is required.
 *
 * @param  charsetName
 *         The name of a supported {@linkplain java.nio.charset.Charset
 *         charset}
 *
 * @return  The resultant byte array
 *
 * @throws  UnsupportedEncodingException
 *          If the named charset is not supported
 *
 * @since  JDK1.1
 */
public byte[] getBytes(String charsetName)
        throws UnsupportedEncodingException {
    if (charsetName == null) throw new NullPointerException();
    return StringCoding.encode(charsetName, value, 0, value.length);
}

/**
 * Same as above, but takes a Charset object instead of a charset name.
 * Encodes this {@code String} into a sequence of bytes using the given
 * {@linkplain java.nio.charset.Charset charset}, storing the result into a
 * new byte array.
 *
 * <p> This method always replaces malformed-input and unmappable-character
 * sequences with this charset's default replacement byte array.  The
 * {@link java.nio.charset.CharsetEncoder} class should be used when more
 * control over the encoding process is required.
 *
 * @param  charset
 *         The {@linkplain java.nio.charset.Charset} to be used to encode
 *         the {@code String}
 *
 * @return  The resultant byte array
 *
 * @since  1.6
 */
public byte[] getBytes(Charset charset) {
    if (charset == null) throw new NullPointerException();
    return StringCoding.encode(charset, value, 0, value.length);
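}

/**
 * Uses the platform's default charset.
 * This is a common source of deployment bugs: the default byte encoding can
 * differ between Windows and Linux environments, so specifying the charset explicitly is recommended.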
 * Encodes this {@code String} into a sequence of bytes using the
 * platform's default charset, storing the result into a new byte array.
 *
 * <p> The behavior of this method when this string cannot be encoded in
 * the default charset is unspecified.  The {@link
 * java.nio.charset.CharsetEncoder} class should be used when more control
 * over the encoding process is required.
 *
 * @return  The resultant byte array
 *
 * @since      JDK1.1
 */
public byte[] getBytes() {
    return StringCoding.encode(value, 0, value.length);
}
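
As a small illustration of why the charset should be named explicitly, the sketch below encodes the same one-character string with two charsets and compares the byte counts. The class and method names are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;

// Illustrative class; the name is an assumption, not JDK code.
class GetBytesDemo {

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes when encoded as big-endian UTF-16 (no byte order mark).
    static int utf16beLength(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String s = "\u4e2d"; // a single CJK character
        System.out.println(s.length());       // 1
        System.out.println(utf8Length(s));    // 3
        System.out.println(utf16beLength(s)); // 2
    }
}
```

Passing a Charset constant also avoids the checked UnsupportedEncodingException that the charset-name overload declares.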

hashCode

/**
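 * The hash algorithm.
 * hashCode guarantees that equal strings always have the same hash value,
 * but equal hash values do not imply equal contents,
 * so equality of two strings still has to be confirmed with equals.
 * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]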
 */
public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;

        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}
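
The formula can be checked by hand. The sketch below recomputes the hash with the same loop and also shows a well-known collision: two different strings with the same hash code. The class and method names are illustrative assumptions.

```java
// Illustrative class; the name is an assumption, not JDK code.
class StringHashDemo {

    // Same recurrence as String.hashCode: h = 31 * h + s[i].
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(manualHash("hello") == "hello".hashCode()); // true
        // 'A'*31 + 'a' = 65*31 + 97 = 2112 and 'B'*31 + 'B' = 66*31 + 66 = 2112
        System.out.println("Aa".hashCode()); // 2112
        System.out.println("BB".hashCode()); // 2112
        System.out.println("Aa".equals("BB")); // false
    }
}
```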

equals

/**
  * 
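  * For a type to work as a HashMap key,
  * it must override both equals and hashCode
  * so that keys with the same contents are treated as the same key.
  * Because String overrides both, strings can safely be used as map keys.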
 public boolean equals(Object anObject) {
     /** First check whether it is the same object */
     if (this == anObject) {
         return true;
     }
     /** Then check whether it is a String */
     if (anObject instanceof String) {
         String anotherString = (String)anObject;
         int n = value.length;
         /** Compare lengths */
         if (n == anotherString.value.length) {
             char v1[] = value;
             char v2[] = anotherString.value;
             int i = 0;
             /** Compare character by character */
             while (n-- != 0) {
                 if (v1[i] != v2[i])
                     return false;
                 i++;
             }
             return true;
         }
     }
     return false;
 }
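
A short sketch of what this means for map lookups: two distinct String objects with the same characters are different references but equal keys. The class and method names are assumptions for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative class; the name is an assumption, not JDK code.
class StringKeyDemo {

    // Look up a map entry using a different String object with equal contents.
    static Integer lookupWithDistinctInstance() {
        Map<String, Integer> map = new HashMap<>();
        map.put("key", 1);
        String other = new String("key"); // distinct object, same characters
        return map.get(other);
    }

    // Reference comparison of a literal and a freshly constructed String.
    static boolean sameReference() {
        String a = "key";
        String b = new String("key");
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(lookupWithDistinctInstance()); // 1
        System.out.println(sameReference());              // false
    }
}
```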

substring

This method behaves differently in JDK 1.6 and earlier than in JDK 1.7 and later.

JDK1.6 substring

/** 
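 * substring still creates a new String object, but the old character array stays alive: the new string merely references part of it.
 * When the old string is very large, it cannot be reclaimed because a small substring still references its array, which leaks memory.
 * A common workaround is to concatenate an empty string, which produces a new string with its own array:
 * str = str.substring(x, y) + ""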
 */
String(int offset, int count, char value[]) {
    this.value = value;
    this.offset = offset;
    this.count = count;
}

public String substring(int beginIndex, int endIndex) {
    /** bounds checks omitted */
    return  new String(offset + beginIndex, endIndex - beginIndex, value);
}

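Memory leak: in computer science, a memory leak occurs when a program, through oversight or error, fails to release memory that it no longer uses. The memory does not physically disappear; rather, because of a design flaw, the application loses control of a block of memory after allocating it and before releasing it, so that memory is wasted.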

JDK1.8 substring

Since JDK 1.7, substring creates a new string with its own copy of the characters. This uses more memory, but it removes the memory leak.

public String substring(int beginIndex, int endIndex) {

    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    if (endIndex > value.length) {
        throw new StringIndexOutOfBoundsException(endIndex);
    }
    int subLen = endIndex - beginIndex;
    if (subLen < 0) {
        throw new StringIndexOutOfBoundsException(subLen);
    }
    return ((beginIndex == 0) && (endIndex == value.length)) ? this
            : new String(value, beginIndex, subLen);
}

public String(char value[], int offset, int count) {
   if (offset < 0) {
        throw new StringIndexOutOfBoundsException(offset);
    }
    if (count <= 0) {
        if (count < 0) {
            throw new StringIndexOutOfBoundsException(count);
        }
        if (offset <= value.length) {
            this.value = "".value;
            return;
        }
    }
    // Note: offset or count might be near -1>>>1.
    if (offset > value.length - count) {
        throw new StringIndexOutOfBoundsException(offset + count);
    }
    this.value = Arrays.copyOfRange(value, offset, offset+count);
}
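
The difference between the two strategies can be modelled outside the JDK. The sketch below is a toy model, not the real String implementation: SharedSlice and CopiedSlice are hypothetical classes that mimic the JDK 1.6 and JDK 1.7+ behaviour and report how many chars each slice keeps reachable.

```java
import java.util.Arrays;

// Toy model; all class names here are hypothetical, not JDK code.
class SubstringModelDemo {

    // JDK 1.6 style: the slice keeps a reference to the whole parent array.
    static final class SharedSlice {
        final char[] value;
        final int offset;
        final int count;

        SharedSlice(char[] value, int offset, int count) {
            this.value = value;
            this.offset = offset;
            this.count = count;
        }

        int retainedChars() {
            return value.length;
        }
    }

    // JDK 1.7+ style: the slice copies only the range it needs.
    static final class CopiedSlice {
        final char[] value;

        CopiedSlice(char[] source, int offset, int count) {
            this.value = Arrays.copyOfRange(source, offset, offset + count);
        }

        int retainedChars() {
            return value.length;
        }
    }

    public static void main(String[] args) {
        char[] big = new char[1_000_000];
        Arrays.fill(big, 'x');
        System.out.println(new SharedSlice(big, 0, 3).retainedChars()); // 1000000
        System.out.println(new CopiedSlice(big, 0, 3).retainedChars()); // 3
    }
}
```

The shared version is cheaper to create, but a three-character slice keeps a million chars alive; the copying version pays for the copy and lets the large array be collected.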

valueOf

/** For a non-null argument, delegates to the object's own toString method */
public static String valueOf(Object obj) {
    return (obj == null) ? "null" : obj.toString();
}
public static String valueOf(char data[]) {
    return new String(data);
}
public static String valueOf(char data[], int offset, int count) {
    return new String(data, offset, count);
}
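
A brief sketch of how these overloads behave, including the null cases, which differ between the Object and char[] versions. The class and method names are assumptions for illustration.

```java
// Illustrative class; the name is an assumption, not JDK code.
class ValueOfDemo {

    // A null Object reference is converted to the four-character string "null".
    static String ofNullObject() {
        Object o = null;
        return String.valueOf(o);
    }

    // Copies characters data[offset] .. data[offset + count - 1].
    static String ofRange() {
        char[] data = {'a', 'b', 'c', 'd'};
        return String.valueOf(data, 1, 2);
    }

    // A null char[] reaches new String(data) and fails there.
    static boolean nullCharArrayThrows() {
        char[] data = null;
        try {
            String.valueOf(data);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(ofNullObject());        // null
        System.out.println(ofRange());             // bc
        System.out.println(nullCharArrayThrows()); // true
    }
}
```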

String + operator overloading

 String str = "abc";
 String str1= str + "def";

 /** After decompiling */
 String str = "abc";
 String str1= (new StringBuilder(String.valueOf(str))).append("def").toString();
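
Because each + on the JDK 1.8 compiler expands to a new StringBuilder as shown above, concatenating inside a loop creates one builder per iteration. The sketch below contrasts that with a single explicit builder; the class and method names are illustrative assumptions, and both methods return the same text.

```java
// Illustrative class; the name is an assumption, not JDK code.
class ConcatDemo {

    // Uses += in a loop: under the expansion above, every iteration builds
    // a temporary StringBuilder and a new String.
    static String joinWithPlus(int n) {
        String s = "";
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    // Uses one StringBuilder for the whole loop.
    static String joinWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWithPlus(5));    // 01234
        System.out.println(joinWithBuilder(5)); // 01234
    }
}
```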

split

Splits the string around matches of the regular expression regex, producing at most limit pieces when limit is positive.

 public String[] split(String regex, int limit) {
     /* fastpath if the regex is a
      (1)one-char String and this character is not one of the
         RegEx's meta characters ".$|()[{^?*+\\", or
      (2)two-char String and the first char is the backslash and
         the second is not the ascii digit or ascii letter.
      */
     char ch = 0;
     if (((regex.value.length == 1 &&
          ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
          (regex.length() == 2 &&
           regex.charAt(0) == '\\' &&
           (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
           ((ch-'a')|('z'-ch)) < 0 &&
           ((ch-'A')|('Z'-ch)) < 0)) &&
         (ch < Character.MIN_HIGH_SURROGATE ||
          ch > Character.MAX_LOW_SURROGATE))
     {
         int off = 0;
         int next = 0;
         boolean limited = limit > 0;
         ArrayList<String> list = new ArrayList<>();
         while ((next = indexOf(ch, off)) != -1) {
             if (!limited || list.size() < limit - 1) {
                 list.add(substring(off, next));
                 off = next + 1;
             } else {    // last one
                 //assert (list.size() == limit - 1);
                 list.add(substring(off, value.length));
                 off = value.length;
                 break;
             }
         }
         // If no match was found, return this
         if (off == 0)
             return new String[]{this};

         // Add remaining segment
         if (!limited || list.size() < limit)
             list.add(substring(off, value.length));

         // Construct result
         int resultSize = list.size();
         if (limit == 0) {
             while (resultSize > 0 && list.get(resultSize - 1).length() == 0) {
                 resultSize--;
             }
         }
         String[] result = new String[resultSize];
         return list.subList(0, resultSize).toArray(result);
     }
     return Pattern.compile(regex).split(this, limit);
 }

Splits the string around matches of the regular expression regex.

 /** Delegates to split(String regex, int limit) with limit set to 0 */
 public String[] split(String regex) {
    return split(regex, 0);
  }
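
The effect of limit and of regex metacharacters is easiest to see with a few calls. This is a minimal sketch; the class name and the show helper are assumptions for illustration.

```java
import java.util.Arrays;

// Illustrative class; the name is an assumption, not JDK code.
class SplitDemo {

    // Render an array so that empty strings stay visible between commas.
    static String show(String[] parts) {
        return Arrays.toString(parts);
    }

    public static void main(String[] args) {
        // limit == 0: trailing empty strings are removed
        System.out.println(show("a,b,,".split(",")));     // [a, b]
        // negative limit: trailing empty strings are kept
        System.out.println(show("a,b,,".split(",", -1))); // [a, b, , ]
        // positive limit: at most that many pieces
        System.out.println(show("a,b,c".split(",", 2)));  // [a, b,c]
        // "." is a regex metacharacter that matches every character
        System.out.println("a.b".split(".").length);      // 0
        // escaped dot splits on the literal character
        System.out.println(show("a.b".split("\\.")));     // [a, b]
    }
}
```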

equalsIgnoreCase

public boolean equalsIgnoreCase(String anotherString) {
        return (this == anotherString) ? true
                : (anotherString != null)
                && (anotherString.value.length == value.length)
                && regionMatches(true, 0, anotherString, 0, value.length);
    }

A ternary operator combined with && replaces a chain of if statements.
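
For comparison, the same logic written as a chain of if statements might look like the sketch below. It is a rewrite for illustration only: the class and method names are hypothetical, and it uses the public regionMatches method rather than the private value field.

```java
// Illustrative class; the name is an assumption, not JDK code.
class IgnoreCaseDemo {

    // Step-by-step equivalent of the one-expression JDK version.
    // The first argument plays the role of "this" and must not be null.
    static boolean equalsIgnoreCaseWithIfs(String self, String other) {
        if (self == other) {
            return true;                 // same object
        }
        if (other == null) {
            return false;                // nothing to compare against
        }
        if (other.length() != self.length()) {
            return false;                // different lengths cannot match
        }
        return self.regionMatches(true, 0, other, 0, self.length());
    }

    public static void main(String[] args) {
        System.out.println(equalsIgnoreCaseWithIfs("Hello", "hELLO")); // true
        System.out.println(equalsIgnoreCaseWithIfs("Hello", null));    // false
        System.out.println(equalsIgnoreCaseWithIfs("Hello", "Hell"));  // false
    }
}
```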

replaceFirst, replaceAll, replace

String replaceFirst(String regex, String replacement)
String replaceAll(String regex, String replacement)
String replace(CharSequence target, CharSequence replacement)

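replace accepts char or CharSequence arguments, so it supports replacing both single characters and substrings; it treats the target literally and replaces every occurrence.
replaceAll and replaceFirst take a regex, so they replace based on regular-expression matching; replaceAll replaces every match.
replaceFirst replaces only the first match.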
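
The literal versus regex distinction matters most when the target contains a metacharacter such as a dot. A minimal sketch (the class and method names are assumptions for illustration):

```java
// Illustrative class; the name is an assumption, not JDK code.
class ReplaceDemo {

    // Literal replacement: the dot is just a dot.
    static String literal(String s) {
        return s.replace(".", "-");
    }

    // Regex replacement: an unescaped dot matches any character.
    static String regexAll(String s) {
        return s.replaceAll(".", "-");
    }

    // Regex replacement of the first literal dot only.
    static String regexFirst(String s) {
        return s.replaceFirst("\\.", "-");
    }

    public static void main(String[] args) {
        System.out.println(literal("a.b.c"));           // a-b-c
        System.out.println(regexAll("a.b.c"));          // -----
        System.out.println(regexFirst("a.b.c"));        // a-b.c
        System.out.println("a.b.c".replace('a', 'x'));  // x.b.c
    }
}
```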

Other methods

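The String class has many more methods. For example:

public int length(){}
Returns the length of the string.
public boolean isEmpty() { }
Returns whether the string is empty (length 0).
public char charAt(int index) {}
Returns the character at the given zero-based index, that is, the (index+1)-th character.
public char[] toCharArray() {}
Converts the string to a char array.
public String trim(){}
Removes leading and trailing whitespace.
public String toUpperCase(){}
Converts the string to upper case.
public String toLowerCase(){}
Converts the string to lower case.
public String concat(String str) {}
Concatenates str to the end of the string.
public boolean matches(String regex){}
Tests whether the string matches the given regular expression regex.
public boolean contains(CharSequence s)
Tests whether the string contains the character sequence s.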
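
A quick tour of these methods on small inputs, as a minimal sketch. The class name and the normalize helper are assumptions for illustration.

```java
// Illustrative class; the name is an assumption, not JDK code.
class OtherMethodsDemo {

    // Chains two of the listed methods: strip outer whitespace, then lower-case.
    static String normalize(String raw) {
        return raw.trim().toLowerCase();
    }

    public static void main(String[] args) {
        System.out.println("abc".length());                     // 3
        System.out.println("".isEmpty());                       // true
        System.out.println("abc".charAt(1));                    // b
        System.out.println("abc".toCharArray().length);         // 3
        System.out.println(normalize("  Hello "));              // hello
        System.out.println("ab".concat("cd"));                  // abcd
        System.out.println("abc123".matches("[a-z]+\\d+"));     // true
        System.out.println("abc123".contains("c1"));            // true
    }
}
```

Every call returns a new value and leaves the receiver unchanged, which follows from the immutability discussed earlier.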

Original article: https://yq.aliyun.com/articles/680125
