Sedikit kehalusan java.lang.String

Salam pembuka


Melihat materi yang terakumulasi, java.lang.Stringsaya memutuskan untuk membuat sedikit pilihan contoh dari penggunaan yang efektif (dan tidak demikian).


Konversi baris apa pun akan menghasilkan baris baru


Ini adalah salah satu mitos utama tentang garis. Sebenarnya, ini tidak selalu terjadi. Misalkan kita memiliki string yang hanya berisi huruf kecil:


var str = "str";

Sekarang kode ini


jshell> var str = "str";
jshell> System.out.println(str.toLowerCase() == str);

akan menampilkan


true

Dengan kata lain, di sini panggilan toLowerCase()mengembalikan garis yang dipanggil. Dan meskipun perilaku ini tidak dijelaskan dalam dokumentasi, kode StringLatin1.toLowerCase()tidak meninggalkan keraguan (di sini dan di bawah ini adalah kode dari https://hg.openjdk.java.net/jdk/jdk/ ):


public static String toLowerCase(String str, byte[] value, Locale locale) {
  if (locale == null) {
    throw new NullPointerException();
  }
  int first;
  final int len = value.length;
  // Now check if there are any characters that need to be changed
  for (first = 0 ; first < len; first++) {
    int cp = value[first] & 0xff;
    // no need to check Character.ERROR
    if (cp != CharacterDataLatin1.instance.toLowerCase(cp)) {
      break;
    }
  }
  if (first == len)
    return str;     // <--   this
  //...
}

: , . , , , String.trim() String.strip():


//  :    strip()
//  trim()    this

/**
 *
 * @return  a string whose value is this string, with all leading
 *          and trailing space removed, or this string if it
 *          has no leading or trailing space.
 */
public String trim() {
  String ret = isLatin1() ? StringLatin1.trim(value)
                          : StringUTF16.trim(value);
  return ret == null ? this : ret;
}

/**
 * @return  a string whose value is this string, with all leading
 *          and trailing white space removed
 *
 * @see Character#isWhitespace(int)
 *
 * @since 11
 */
public String strip() {
  String ret = isLatin1() ? StringLatin1.strip(value)
                          : StringUTF16.strip(value);
  return ret == null ? this : ret;
}

:


boolean isUpperCase = name.toUpperCase().equals(name);

- StringUtils, ( ""). / /, , name.toUpperCase() name, ?


boolean isUpperCase = name.toUpperCase() == name; // 

, , String.toUpperCase() . ( , ) o.a.c.l.StringUtils.isAllUpperCase().



boolean eq = aString.toUpperCase().equals(anotherString);


boolean eq = aString.equalsIgnoreCase(anotherString);

, "" , "".


String.toLowerCase()


String.toLowerCase() / String.toUpperCase() , . :


boolean isEmpty = someStr.toLowerCase().isEmpty();

, . , / . , isEmpty() true. false, . . 1 , .
, :


boolean isEmpty = someStr.isEmpty();

. String.isEmpty() :


public boolean isEmpty() {
  return value.length == 0;
}


int len = someStr.toLowerCase().length();


int len = someStr.length();

, ?


String s = "!";


String s = "!";

, , . . — . , toLowerCase() / toUpperCase() , . , . , :


@Test
void toLowerCase() {
  String str = "\u00cc"; // Ì

  assert str.length() == 1;

  String strLowerCase = str.toLowerCase(new Locale("lt"));

  assert strLowerCase.length() == 3; // i̇̀
}

, : " ?" 1 , ( — 6 (!) ). :


/**
 * Converts all of the characters in this {@code String} to lower
 * case using the rules of the given {@code Locale}.  Case mapping is based
 * on the Unicode Standard version specified by the {@link java.lang.Character Character}
 * class. Since case mappings are not always 1:1 char mappings, the resulting
 * {@code String} may be a different length than the original {@code String}.
 */
public String toLowerCase(Locale locale) {
  //...
}

:


//StringLatin1

public static String toLowerCase(String str, byte[] value, Locale locale) {
  // ...
  String lang = locale.getLanguage();
  if (lang == "tr" || lang == "az" || lang == "lt") {        // !!!
    return toLowerCaseEx(str, value, first, locale, true);
  }
  //...
}

, , :)



1 — String.substring(n, n+1) — , , , 1. :


boolean startsWithUnderline = message.substring(0, 1).equals("_");


boolean startsWithUnderline = message.charAt(0) == '_';

, . :


String s = "xxx" + name.substring(n, n + 1);


String s = "xxx" + name.charAt(n);

, . . . , .


— :


boolean startsWithUrl = content.substring(index, index + 4).equals("url(");


boolean startsWithUrl = content.startsWith("url(", index);

. , ( ):


private String findPerClause(String str) {
  str = str.substring(str.indexOf('(') + 1);
  str = str.substring(0, str.length() - 1);
  return str;
}

, , :


 (  ,   )
-->
  ,   

, , :


private String findPerClause(String str) {
  int beginIndex = str.indexOf('(') + 1;
  int endIndex = str.length() - 1;
  return str.substring(beginIndex, endIndex);
}

, :


int idx = path.substring(2).indexOf('/');

, String.indexOf(int ch, int fromIndex), :


int idx = path.indexOf('/', 2);

. , '/' 2, . . :


int idx = name.indexOf('/', 2);
if (pos != -1)  {
  idx -= 2;
}

, .


JDK. ,


someStr.substring(n, n);

, n :


// String

public String substring(int beginIndex, int endIndex) {
  int length = length();
  checkBoundsBeginEnd(beginIndex, endIndex, length);
  int subLen = endIndex - beginIndex;
  if (beginIndex == 0 && endIndex == length) {
    return this;
  }
  return isLatin1() ? StringLatin1.newString(value, beginIndex, subLen)
                    : StringUTF16.newString(value, beginIndex, subLen);
}

// StringLatin1

public static String newString(byte[] val, int index, int len) {
  return new String(Arrays.copyOfRange(val, index, index + len), LATIN1);
}

beginIndex endIndex subLen 0, StringLatin1.newString() . , :


// StringLatin1

public static String newString(byte[] val, int index, int len) {
  if (len == 0) {
      return "";
  }
  return new String(Arrays.copyOfRange(val, index, index + len), LATIN1);
}

StringLatin1.stripLeading() / stripTrailing() StringUTF16. .


, :


//  StringLatin1  
public static String stripLeading(byte[] value) {
  int left = indexOfNonWhitespace(value);
  if (left == value.length) {
    return "";
  }
  return (left != 0) ? newString(value, left, value.length - left) : null;
}

value.length == 0 . left == value.length newString,


public static String stripLeading(byte[] value) {
  int left = indexOfNonWhitespace(value);
  return (left != 0) ? newString(value, left, value.length - left) : null;
}

null! String.stripLeading() , this, . , . :


// 
boolean b= new String("").stripLeading() == ""; // true

//  
boolean b= new String("").stripLeading() == ""; // false !

, ?


, :)
From a compatibility point of view I think this should be fine, as
the identity of the returned empty string isn't specified.

https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-February/064957.html


!


, :


@Warmup(iterations = 10, time = 1)
@Measurement(iterations = 10, time = 1)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(value = 3, jvmArgsAppend = {"-Xms4g", "-Xmx4g", "-XX:+UseParallelGC"})
public class SubstringBenchmark {
    private static final String str = "Tolstoy";

    @Benchmark
    public String substring() {
        return str.substring(1, 1);
    }
}

:




                                            Mode    Score    Error   Units
substring                                   avgt      5.8 ±  0.066   ns/op
substring:·gc.alloc.rate                    avgt   4325.9 ± 47.259  MB/sec
substring:·gc.alloc.rate.norm               avgt     40.0 ±  0.001    B/op
substring:·gc.churn.G1_Eden_Space           avgt   4338.8 ± 86.555  MB/sec
substring:·gc.churn.G1_Eden_Space.norm      avgt     40.1 ±  0.647    B/op
substring:·gc.churn.G1_Survivor_Space       avgt      0.0 ±  0.003  MB/sec
substring:·gc.churn.G1_Survivor_Space.norm  avgt   ≈ 10⁻⁴             B/op
substring:·gc.count                         avgt    557.0           counts
substring:·gc.time                          avgt    387.0               ms



substring                                   avgt      2.4 ±  0.172   ns/op
substring:·gc.alloc.rate                    avgt      0.0 ±  0.001  MB/sec
substring:·gc.alloc.rate.norm               avgt   ≈ 10⁻⁵             B/op
substring:·gc.count                         avgt      ≈ 0           counts

, String.substring(n, n) , .



, , , , . , AnnotationMetadataReadingVisitor-:


MultiValueMap<String, Object> getAllAnnotationAttributes(String annotationName, boolean classValAsStr) {
  // ...
  String annotatedElement = "class '" + getClassName() + "'";
  for (AnnotationAttributes raw : attributes) {
    for (Map.Entry<String, Object> entry : convertClassValues(
      "class '" + getClassName() + "'", classLoader, raw, classValAsStr).entrySet()) {
      allAttributes.add(entry.getKey(), entry.getValue());
    }
  }
  return allAttributes;
}

Ekspresi "class '" + getClassName() + "'"akan sama dan kami tidak ingin membuat garis yang sama dalam loop ganda, jadi lebih baik untuk membuatnya 1 kali di luar loop. Sebelumnya, menangkap contoh seperti itu adalah masalah kebetulan: Saya menemukan ini gagal di dalam sumber saat debug aplikasi saya. Sekarang berkat IDEA-230889 ini bisa otomatis. Tentu saja, itu jauh dari selalu penciptaan baris baru dalam satu lingkaran, terlepas dari bagian, tetapi bahkan dalam kasus ini, kita dapat membedakan yang ada bagian konstan yang bertahan lama:


// org.springframework.beans.factory.support.BeanDefinitionReaderUtils

public static String uniqueBeanName(String beanName, BeanDefinitionRegistry registry) {
  String id = beanName;
  int counter = -1;

  // Increase counter until the id is unique.
  while (counter == -1 || registry.containsBeanDefinition(id)) {
    counter++;
    id = beanName + GENERATED_BEAN_NAME_SEPARATOR + counter;
  }
  return id;
}

Di sini awalan beanName + GENERATED_BEAN_NAME_SEPARATORselalu sama, oleh karena itu dapat dibawa keluar.


Itu saja, tulis contoh Anda di komentar - kami akan membahasnya.


All Articles