String.prototype.charAt()

Summary

Returns the character at the specified position in a string.

Syntax

string.charAt(index)

Parameters

index: An integer between 0 and 1 less than the length of the string.

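Description

Characters in a string are indexed from left to right. The index of the first character is 0, and the index of the last character in a string called stringName is stringName.length - 1. If the index you supply is out of range, JavaScript returns an empty string. No error is thrown.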
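
The out-of-range behavior described above can be seen in a small sketch. The variable name `greeting` is only an illustration and is not part of the original article; the comparison with bracket notation is added for contrast.

```javascript
// Illustrative only: `greeting` is a hypothetical variable name.
var greeting = "Brave";

// The last valid index is length - 1.
var lastChar = greeting.charAt(greeting.length - 1); // "e"

// Indexes outside 0..length-1 yield an empty string, not an error.
var pastEnd = greeting.charAt(greeting.length); // ""
var negative = greeting.charAt(-1);             // ""

// For contrast, bracket notation yields undefined for a missing index.
var bracketPastEnd = greeting[greeting.length]; // undefined

console.log(JSON.stringify([lastChar, pastEnd, negative]), bracketPastEnd);
```

Because the result is always a string, code that concatenates or compares the return value of charAt() does not need a separate check for undefined.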

Examples

Example: Displaying characters at different positions in a string

The following example displays characters at different positions in the string "Brave new world".

var anyString = "Brave new world";

document.writeln("The character at index   0 is '" + anyString.charAt(0) + "'");
document.writeln("The character at index   1 is '" + anyString.charAt(1) + "'");
document.writeln("The character at index   2 is '" + anyString.charAt(2) + "'");
document.writeln("The character at index   3 is '" + anyString.charAt(3) + "'");
document.writeln("The character at index   4 is '" + anyString.charAt(4) + "'");
document.writeln("The character at index 999 is '" + anyString.charAt(999) + "'");

The code above produces the following output:

The character at index   0 is 'B'
The character at index   1 is 'r'
The character at index   2 is 'a'
The character at index   3 is 'v'
The character at index   4 is 'e'
The character at index 999 is ''

Example 2: Getting whole characters

The following ensures that looping through a string always yields a whole character, even if the string contains characters outside the Basic Multilingual Plane (BMP).

var str = 'A \uD87E\uDC04 Z'; // We could also use a non-BMP character directly
for (var i=0, chr; i < str.length; i++) {
  if ((chr = getWholeChar(str, i)) === false) {
    continue;
  } // Adapt this line at the top of each loop, passing in the whole string and the current iteration and returning a variable to represent the individual character

  alert(chr);
}

function getWholeChar (str, i) {
  var code = str.charCodeAt(i);

  if (isNaN(code)) {
    return ''; // Position not found
  }
  if (code < 0xD800 || code > 0xDFFF) {
    return str.charAt(i);
  }
  if (0xD800 <= code && code <= 0xDBFF) { // High surrogate (could change last hex to 0xDB7F to treat high private surrogates as single characters)
    if (str.length <= (i+1))  {
      throw 'High surrogate without following low surrogate';
    }
    var next = str.charCodeAt(i+1);
    if (0xDC00 > next || next > 0xDFFF) {
      throw 'High surrogate without following low surrogate';
    }
    return str.charAt(i)+str.charAt(i+1);
  }
  // Low surrogate (0xDC00 <= code && code <= 0xDFFF)
  if (i === 0) {
    throw 'Low surrogate without preceding high surrogate';
  }
  var prev = str.charCodeAt(i-1);
  if (0xD800 > prev || prev > 0xDBFF) { // (could change last hex to 0xDB7F to treat high private surrogates as single characters)
    throw 'Low surrogate without preceding high surrogate';
  }
  return false; // We can pass over low surrogates now as the second component in a pair which we have already processed
}

In an environment that supports destructuring assignment (JavaScript 1.7+ in Firefox, and standardized in ECMAScript 2015), the following alternative is more succinct and somewhat more flexible: it advances the loop variable automatically when the character is a surrogate pair.

var str = 'A\uD87E\uDC04Z'; // We could also use a non-BMP character directly
for (var i = 0, chr; i < str.length; i++) {
  [chr, i] = getWholeCharAndI(str, i);
  // Adapt this line at the top of each loop, passing in the whole string and the current iteration and returning an array with the individual character and 'i' value (only changed if a surrogate pair)

  alert(chr);
}

function getWholeCharAndI (str, i) {
  var code = str.charCodeAt(i);

  if (isNaN(code)) {
    return ['', i]; // Position not found (return an array so the caller's destructuring still works)
  }
  if (code < 0xD800 || code > 0xDFFF) {
    return [str.charAt(i), i]; // Normal character, keeping 'i' the same
  }
  if (0xD800 <= code && code <= 0xDBFF) { // High surrogate (could change last hex to 0xDB7F to treat high private surrogates as single characters)
    if (str.length <= (i+1))  {
      throw 'High surrogate without following low surrogate';
    }
    var next = str.charCodeAt(i+1);
    if (0xDC00 > next || next > 0xDFFF) {
      throw 'High surrogate without following low surrogate';
    }
    return [str.charAt(i)+str.charAt(i+1), i+1];
  }
  // Low surrogate (0xDC00 <= code && code <= 0xDFFF)
  if (i === 0) {
    throw 'Low surrogate without preceding high surrogate';
  }
  var prev = str.charCodeAt(i-1);
  if (0xD800 > prev || prev > 0xDBFF) { // (could change last hex to 0xDB7F to treat high private surrogates as single characters)
    throw 'Low surrogate without preceding high surrogate';
  }
  return [str.charAt(i+1), i+1]; // Return the next character instead (and increment)
}
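
In engines that implement ECMAScript 2015, the built-in string iterator already walks a string by code point, so a `for...of` loop may be a simpler way to get whole characters. The sketch below is an alternative to the helper functions above, not a replacement with identical behavior: unlike `getWholeChar`, it does not throw on an unpaired surrogate but yields it as a separate item. The `collectWholeChars` name is hypothetical.

```javascript
// Hypothetical helper: collects each whole character (code point) of a string.
function collectWholeChars(str) {
  var chars = [];
  // The string iterator yields a surrogate pair as one two-unit string.
  for (var chr of str) {
    chars.push(chr);
  }
  return chars;
}

var wholeChars = collectWholeChars('A\uD87E\uDC04Z');
console.log(wholeChars.length);    // 3
console.log(wholeChars[1].length); // 2 (one character, two UTF-16 code units)
```

If unpaired surrogates must be rejected rather than passed through, the explicit checks in the functions above are still needed.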

Example 3: Fixing charAt to support non-Basic-Multilingual-Plane (BMP) characters

Example 2 is likely to be more useful for supporting non-BMP characters, since it does not require the caller to know where any non-BMP character might appear. If, however, you want to select a character by index while treating each surrogate pair in the string as the single character it represents, you can use the following:

function fixedCharAt (str, idx) {
  var ret = '';
  str += '';
  var end = str.length;

  var surrogatePairs = /[\uD800-\uDBFF][\uDC00-\uDFFF]/g;
  while ((surrogatePairs.exec(str)) != null) {
    var li = surrogatePairs.lastIndex;
    if (li - 2 < idx) {
      idx++;
    } else {
      break;
    }
  }

  if (idx >= end || idx < 0) {
    return '';
  }

  ret += str.charAt(idx);

  if (/[\uD800-\uDBFF]/.test(ret) && /[\uDC00-\uDFFF]/.test(str.charAt(idx+1))) {
    ret += str.charAt(idx+1); // Go one further, since one of the "characters" is part of a surrogate pair
  }
  return ret;
}
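
To see what indexing by whole character means in practice, the sketch below compares plain `charAt` with a code-point-based lookup. It assumes an ECMAScript 2015 engine, where `Array.from` splits a string by code point. `codePointCharAt` is a hypothetical name and is not the `fixedCharAt` function above, although for strings whose surrogates are all correctly paired the two are expected to select the same characters.

```javascript
// Hypothetical helper: returns the idx-th whole character, or '' if out of range.
function codePointCharAt(str, idx) {
  var chars = Array.from(String(str)); // one array item per code point
  var chr = chars[idx];
  return chr === undefined ? '' : chr;
}

var sample = 'A\uD87E\uDC04Z';

console.log(sample.charAt(2) === '\uDC04');                 // true: charAt sees the low surrogate
console.log(codePointCharAt(sample, 1) === '\uD87E\uDC04'); // true: the whole pair
console.log(codePointCharAt(sample, 2));                    // "Z"
console.log(codePointCharAt(sample, 3) === '');             // true: out of range
```

This version builds a full array on every call, so it trades speed for brevity; the regular-expression approach above avoids that allocation.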

Document Tags and Contributors

Contributors to this page: teoli, ethertank, Potappo, Mgjbot

Last updated by: teoli, 2015/04/14 1:46:57
