Small subtleties of java.lang.String

Greetings!

While going through the material I have accumulated on java.lang.String, I decided to put together a small selection of examples of effective (and not so effective) usage.

Every string conversion creates a new string

This is one of the main myths about strings. In fact, it is not always the case. Suppose we have a string containing only lowercase letters:

var str = "str";

Now this code

jshell> var str = "str";
jshell> System.out.println(str.toLowerCase() == str);

prints

true

In other words, the toLowerCase() call here returned the very string it was called on. Although this behavior is not described in the documentation, the code of StringLatin1.toLowerCase() leaves no doubt (here and below the code is taken from https://hg.openjdk.java.net/jdk/jdk/):

public static String toLowerCase(String str, byte[] value, Locale locale) {
  if (locale == null) {
    throw new NullPointerException();
  }
  int first;
  final int len = value.length;
  // Now check if there are any characters that need to be changed
  for (first = 0 ; first < len; first++) {
    int cp = value[first] & 0xff;
    // no need to check Character.ERROR
    if (cp != CharacterDataLatin1.instance.toLowerCase(cp)) {
      break;
    }
  }
  if (first == len)
    return str;     // <--   this
  //...
}

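This makes sense: if nothing in the string changed, there is no reason to create a new one. The same holds for other methods that may leave the string untouched, for example String.trim() and String.strip():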

// Note: the Javadoc of strip(), unlike that of trim(),
// does not mention that this may be returned

/**
 *
 * @return  a string whose value is this string, with all leading
 *          and trailing space removed, or this string if it
 *          has no leading or trailing space.
 */
public String trim() {
  String ret = isLatin1() ? StringLatin1.trim(value)
                          : StringUTF16.trim(value);
  return ret == null ? this : ret;
}

/**
 * @return  a string whose value is this string, with all leading
 *          and trailing white space removed
 *
 * @see Character#isWhitespace(int)
 *
 * @since 11
 */
public String strip() {
  String ret = isLatin1() ? StringLatin1.strip(value)
                          : StringUTF16.strip(value);
  return ret == null ? this : ret;
}
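The same-instance behavior can be checked directly. Below is a minimal sketch, assuming Java 11 or later for strip(); the class name SameInstanceDemo and its helper methods are made up for illustration. Only trim() documents that it may return this, so the strip() result is printed rather than relied upon.

```java
public class SameInstanceDemo {

    // trim() is documented to return this string when there is nothing to trim
    static boolean trimReturnsSame(String s) {
        return s.trim() == s;
    }

    // Not documented for strip(); the OpenJDK sources quoted above behave this way
    static boolean stripReturnsSame(String s) {
        return s.strip() == s;
    }

    public static void main(String[] args) {
        String clean = "str";
        String padded = " str ";

        System.out.println(trimReturnsSame(clean));   // true, as documented
        System.out.println(trimReturnsSame(padded));  // false, a new string is created
        System.out.println(stripReturnsSame(clean));  // implementation detail, may vary
    }
}
```

Code that depends on such identity checks is fragile, so the sketch is meant only as a way to observe the behavior.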

boolean isUpperCase = name.toUpperCase().equals(name);

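This comes from a StringUtils-style helper (the straightforward approach). But if the string is already in upper case, name.toUpperCase() returns name itself, so why not compare references?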

boolean isUpperCase = name.toUpperCase() == name; // reference comparison

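It works, but it relies on behavior that String.toUpperCase() does not document. A more readable option is o.a.c.l.StringUtils.isAllUpperCase(), although it treats empty strings and non-letter characters differently.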

boolean eq = aString.toUpperCase().equals(anotherString);

boolean eq = aString.equalsIgnoreCase(anotherString);

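Note that the two are not strictly equivalent: the first form matches only when anotherString is already in upper case.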
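The difference between the two comparisons is easy to see on a small example. This is a sketch only: the class IgnoreCaseDemo and its method names are hypothetical, and Locale.ROOT is passed explicitly (an addition not present in the snippets above) so the result does not depend on the default locale.

```java
import java.util.Locale;

public class IgnoreCaseDemo {

    // Matches only when 'b' is exactly the upper-case form of 'a';
    // allocates an intermediate string if 'a' is not already upper case
    static boolean viaToUpperCase(String a, String b) {
        return a.toUpperCase(Locale.ROOT).equals(b);
    }

    // Case-insensitive in both directions, no intermediate string
    static boolean viaEqualsIgnoreCase(String a, String b) {
        return a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        System.out.println(viaToUpperCase("abc", "ABC"));      // true
        System.out.println(viaToUpperCase("abc", "abc"));      // false
        System.out.println(viaEqualsIgnoreCase("abc", "ABC")); // true
        System.out.println(viaEqualsIgnoreCase("abc", "abc")); // true
    }
}
```

So the replacement is safe only where a fully case-insensitive comparison is what was intended.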

`String.toLowerCase()`

Calls to String.toLowerCase() / String.toUpperCase() are sometimes made where they change nothing. For example:

boolean isEmpty = someStr.toLowerCase().isEmpty();

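An empty string stays empty whatever its case, so isEmpty() returns true for it. Otherwise it returns false, because a string with at least 1 character cannot become empty after conversion.
So the conversion can be dropped: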

boolean isEmpty = someStr.isEmpty();

The result is the same and no intermediate string is created. String.isEmpty() itself is trivial:

public boolean isEmpty() {
  return value.length == 0;
}

int len = someStr.toLowerCase().length();

int len = someStr.length();

It looks like the same simplification applies here, doesn't it?

String s = "!";

String s = "!";

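At first glance it does: changing the case of letters should not change the length of a string. That is not always true, though. toLowerCase() / toUpperCase() can change the number of chars in a string. For example: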

@Test
void toLowerCase() {
  String str = "\u00cc"; // Ì

  assert str.length() == 1;

  String strLowerCase = str.toLowerCase(new Locale("lt"));

  assert strLowerCase.length() == 3; // i̇̀
}

Yes, exactly: 1 char turned into 3 chars (that is, 6 (!) bytes in UTF-16). The documentation says so explicitly:

/**
 * Converts all of the characters in this {@code String} to lower
 * case using the rules of the given {@code Locale}.  Case mapping is based
 * on the Unicode Standard version specified by the {@link java.lang.Character Character}
 * class. Since case mappings are not always 1:1 char mappings, the resulting
 * {@code String} may be a different length than the original {@code String}.
 */
public String toLowerCase(Locale locale) {
  //...
}
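The length change is not specific to the Lithuanian locale. A well-known case is the German sharp s, which the String.toUpperCase() Javadoc itself lists as mapping to two characters. A small sketch (the class name CaseLengthDemo and the helper upperLength are hypothetical):

```java
import java.util.Locale;

public class CaseLengthDemo {

    // Length of the upper-cased string, using locale-neutral rules
    static int upperLength(String s) {
        return s.toUpperCase(Locale.ROOT).length();
    }

    public static void main(String[] args) {
        String sharpS = "\u00df"; // ß

        System.out.println(sharpS.length());                   // 1
        System.out.println(sharpS.toUpperCase(Locale.ROOT));   // SS
        System.out.println(upperLength(sharpS));               // 2
    }
}
```

This is why dropping the case conversion before length() is not a behavior-preserving change in general.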

//StringLatin1

public static String toLowerCase(String str, byte[] value, Locale locale) {
  // ...
  String lang = locale.getLanguage();
  if (lang == "tr" || lang == "az" || lang == "lt") {        // !!!
    return toLowerCaseEx(str, value, first, locale, true);
  }
  //...
}

Yes, that is a string comparison with ==, right in the JDK :)

A special case is extracting 1 character: String.substring(n, n+1) always returns a string of length 1. It is often used like this:

boolean startsWithUnderline = message.substring(0, 1).equals("_");

boolean startsWithUnderline = message.charAt(0) == '_';

The same replacement works in concatenation:

String s = "xxx" + name.substring(n, n + 1);

String s = "xxx" + name.charAt(n);

Both forms produce the same string, but the second avoids allocating an intermediate one-character String.

A related case is comparing a substring with a constant:

boolean startsWithUrl = content.substring(index, index + 4).equals("url(");

boolean startsWithUrl = content.startsWith("url(", index);
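The two forms agree whenever the range is valid, but they differ near the end of the string: substring() throws, while startsWith(String, int) simply returns false. The sketch below illustrates this; the class StartsWithDemo and the sample input are made up for the example.

```java
public class StartsWithDemo {

    // Allocates a 4-char substring; throws if index + 4 exceeds the length
    static boolean viaSubstring(String content, int index) {
        return content.substring(index, index + 4).equals("url(");
    }

    // No allocation; returns false when the prefix does not fit
    static boolean viaStartsWith(String content, int index) {
        return content.startsWith("url(", index);
    }

    public static void main(String[] args) {
        String content = "a url(x)"; // length 8

        System.out.println(viaSubstring(content, 2));  // true
        System.out.println(viaStartsWith(content, 2)); // true

        System.out.println(viaStartsWith(content, 6)); // false
        try {
            viaSubstring(content, 6);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("substring() threw: " + e.getMessage());
        }
    }
}
```

If the surrounding code relied on the exception, the replacement changes behavior; otherwise it only removes an allocation.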

Consecutive substring() calls are worth a look too. For example (real-world code):

private String findPerClause(String str) {
  str = str.substring(str.indexOf('(') + 1);
  str = str.substring(0, str.length() - 1);
  return str;
}

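The method returns the text between the first '(' and the last character, creating an intermediate string on the way. Both indexes can be computed up front, so a single substring() call is enough: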

private String findPerClause(String str) {
  int beginIndex = str.indexOf('(') + 1;
  int endIndex = str.length() - 1;
  return str.substring(beginIndex, endIndex);
}

Sometimes a substring is created only to search in it:

int idx = path.substring(2).indexOf('/');

There is an overload, String.indexOf(int ch, int fromIndex), which makes the substring unnecessary:

int idx = path.indexOf('/', 2);

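These are not equivalent, though. The first version returns the position of '/' relative to index 2, the second relative to the start of the string, so a correction is needed: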

int idx = path.indexOf('/', 2);
if (idx != -1)  {
  idx -= 2;
}

Without this correction the two versions return different values.
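The correction can be wrapped in a small helper and compared against the original form. This is a sketch under stated assumptions: the class IndexOfDemo, its method names and the sample paths are hypothetical.

```java
public class IndexOfDemo {

    // Original form: allocates a substring, result is relative to 'from'
    static int viaSubstring(String path, int from) {
        return path.substring(from).indexOf('/');
    }

    // Allocation-free form: search from 'from', then shift a found index back
    static int viaFromIndex(String path, int from) {
        int idx = path.indexOf('/', from);
        return idx == -1 ? -1 : idx - from;
    }

    public static void main(String[] args) {
        System.out.println(viaSubstring("//host/a", 2)); // 4
        System.out.println(viaFromIndex("//host/a", 2)); // 4
        System.out.println(viaSubstring("//host", 2));   // -1
        System.out.println(viaFromIndex("//host", 2));   // -1
    }
}
```

The "not found" case must be left as -1, which is why the shift is applied only to a non-negative result.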

The JDK itself has room for improvement. For example, the call

someStr.substring(n, n);

creates a new empty string each time, whatever n is (for a non-empty someStr):

// String

public String substring(int beginIndex, int endIndex) {
  int length = length();
  checkBoundsBeginEnd(beginIndex, endIndex, length);
  int subLen = endIndex - beginIndex;
  if (beginIndex == 0 && endIndex == length) {
    return this;
  }
  return isLatin1() ? StringLatin1.newString(value, beginIndex, subLen)
                    : StringUTF16.newString(value, beginIndex, subLen);
}

// StringLatin1

public static String newString(byte[] val, int index, int len) {
  return new String(Arrays.copyOfRange(val, index, index + len), LATIN1);
}

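When beginIndex equals endIndex, subLen is 0, yet StringLatin1.newString() still allocates a new String and a new empty array. This can be fixed by returning the constant: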

// StringLatin1

public static String newString(byte[] val, int index, int len) {
  if (len == 0) {
      return "";
  }
  return new String(Arrays.copyOfRange(val, index, index + len), LATIN1);
}

This also affects StringLatin1.stripLeading() / stripTrailing() and their counterparts in StringUTF16, which can now be simplified.

For example, this code:

// StringLatin1
public static String stripLeading(byte[] value) {
  int left = indexOfNonWhitespace(value);
  if (left == value.length) {
    return "";
  }
  return (left != 0) ? newString(value, left, value.length - left) : null;
}

The left == value.length check now duplicates the one inside newString, so it is tempting to drop it (keep the case value.length == 0 in mind):

public static String stripLeading(byte[] value) {
  int left = indexOfNonWhitespace(value);
  return (left != 0) ? newString(value, left, value.length - left) : null;
}

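Except that for an empty value, left is 0 and the method now returns null! String.stripLeading() then returns this rather than the "" constant. In other words, the observable behavior changes: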

// before
boolean b= new String("").stripLeading() == ""; // true

// after
boolean b= new String("").stripLeading() == ""; // false !

Does that break anything? According to the reviewers, no :)

From a compatibility point of view, I think this should be OK, since
the identity of the returned empty string is not specified.

https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-February/064957.html

The effect of the change can be measured with a benchmark:

@Warmup(iterations = 10, time = 1)
@Measurement(iterations = 10, time = 1)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(value = 3, jvmArgsAppend = {"-Xms4g", "-Xmx4g", "-XX:+UseParallelGC"})
public class SubstringBenchmark {
    private static final String str = "Tolstoy";

    @Benchmark
    public String substring() {
        return str.substring(1, 1);
    }
}

Before the change:

                                            Mode    Score    Error   Units
substring                                   avgt      5.8 ±  0.066   ns/op
substring:·gc.alloc.rate                    avgt   4325.9 ± 47.259  MB/sec
substring:·gc.alloc.rate.norm               avgt     40.0 ±  0.001    B/op
substring:·gc.churn.G1_Eden_Space           avgt   4338.8 ± 86.555  MB/sec
substring:·gc.churn.G1_Eden_Space.norm      avgt     40.1 ±  0.647    B/op
substring:·gc.churn.G1_Survivor_Space       avgt      0.0 ±  0.003  MB/sec
substring:·gc.churn.G1_Survivor_Space.norm  avgt   ≈ 10⁻⁴             B/op
substring:·gc.count                         avgt    557.0           counts
substring:·gc.time                          avgt    387.0               ms

After the change:

substring                                   avgt      2.4 ±  0.172   ns/op
substring:·gc.alloc.rate                    avgt      0.0 ±  0.001  MB/sec
substring:·gc.alloc.rate.norm               avgt   ≈ 10⁻⁵             B/op
substring:·gc.count                         avgt      ≈ 0           counts

So with the change, String.substring(n, n) no longer allocates memory and runs more than twice as fast.

Another thing worth watching for is the same string being built over and over inside a loop. For example, in AnnotationMetadataReadingVisitor:

MultiValueMap<String, Object> getAllAnnotationAttributes(String annotationName, boolean classValAsStr) {
  // ...
  String annotatedElement = "class '" + getClassName() + "'";
  for (AnnotationAttributes raw : attributes) {
    for (Map.Entry<String, Object> entry : convertClassValues(
      "class '" + getClassName() + "'", classLoader, raw, classValAsStr).entrySet()) {
      allAttributes.add(entry.getKey(), entry.getValue());
    }
  }
  return allAttributes;
}

The expression "class '" + getClassName() + "'" always yields the same value, and we do not really want to build the same string inside a nested loop, so it is better to create it once outside the loop. Previously, catching such cases was a matter of luck: I happened to step into this source while debugging my application. Now, thanks to IDEA-230889, the search can be automated. Of course, a string created in a loop does not always stay the same across iterations, but even then some cases have a constant part that can be singled out:

// org.springframework.beans.factory.support.BeanDefinitionReaderUtils

public static String uniqueBeanName(String beanName, BeanDefinitionRegistry registry) {
  String id = beanName;
  int counter = -1;

  // Increase counter until the id is unique.
  while (counter == -1 || registry.containsBeanDefinition(id)) {
    counter++;
    id = beanName + GENERATED_BEAN_NAME_SEPARATOR + counter;
  }
  return id;
}

Here the prefix beanName + GENERATED_BEAN_NAME_SEPARATOR is always the same, so it can be hoisted out of the loop.
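A possible shape of the hoisted version is sketched below. It is not the Spring code: the class UniqueNameDemo is hypothetical, a plain Set<String> stands in for BeanDefinitionRegistry, and the SEPARATOR constant is a placeholder for GENERATED_BEAN_NAME_SEPARATOR.

```java
import java.util.Set;

public class UniqueNameDemo {

    // Placeholder for GENERATED_BEAN_NAME_SEPARATOR (assumed value)
    private static final String SEPARATOR = "#";

    // 'taken' stands in for registry.containsBeanDefinition(id)
    static String uniqueName(String baseName, Set<String> taken) {
        String prefix = baseName + SEPARATOR; // constant part, built once
        int counter = 0;
        String id = prefix + counter;

        // Increase the counter until the id is unique
        while (taken.contains(id)) {
            counter++;
            id = prefix + counter;
        }
        return id;
    }

    public static void main(String[] args) {
        System.out.println(uniqueName("bean", Set.of()));                   // bean#0
        System.out.println(uniqueName("bean", Set.of("bean#0", "bean#1"))); // bean#2
    }
}
```

Each iteration now performs one concatenation instead of two, and the sequence of candidate ids is the same as in the original loop.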

That's all. Post your own examples in the comments and we'll go through them.
