本文共 7395 字,大约阅读时间需要 24 分钟。
很多面试题都会提到自己来实现一个函数,strcmp,strcpy,strstr……
我们来看一下一下函数的实现:int
strcmp (p1, p2) const char *p1; const char *p2;{ register const unsigned char *s1 = (const unsigned char *) p1; //register 表示向编译器建议使用高速寄存器存储变量 //const 表示指向的字符串不被修改 //unsigned 表示支持unicode字符 //char 就不用说了^_^ register const unsigned char *s2 = (const unsigned char *) p2; unsigned reg_char c1, c2;do
{ c1 = (unsigned char) *s1++; c2 = (unsigned char) *s2++; if (c1 == '/0') return c1 - c2;//比较的精华所在 } while (c1 == c2);return c1 - c2;
}早上偶想了一上午看了这个代码才知道差距所在,do while 语句莫有想到 c1-c2 也莫有想到.下面来看个稍微复杂点的,strstr,偶们公司面试经常出这道题,不少所谓的英雄好汉折戟沉沙于此,其实不最求效率的话,
两个for循环也可以搞定,最求效率就要看这个了.看完这段代码,莫有惊喜,从作者自信满满的口气中(Until someone tells me otherwise, I assume that this is the
fastest implementation of strstr() in C),我还以为至少效率会超过kmp的,看完以后,这只不过是个精巧的回溯算法罢了,多次用到的go to 也末有什么特色,不过这个人很有意思的一点是不写注释也就罢了,还要故意说出来(I deliberately chose not to comment it. You should have at least as much fun trying to understand it, as I had to write it :-).) 鄙视你 附另一个人写的,貌似加州大学的一个教授应该是solaris上的实现这个算法看上去方方正正,冲正平和,让人一看上去就觉得程序就应该是这么写的,而你自己放手去写却写不出这个味道来,所谓大巧
不工,莫过于此.同是回溯算法,我真的不认为第一个除了神叨叨的故弄玄虚了半天,效率上有多少提升.再来看看strlen,偶自信满满的写了一个先:
自我感觉相当良好,再看看人家写的,完全不懂! 我靠 偶整整落后一百年!
/* Return the length of the null-terminated string STR. Scan for
the null terminator quickly by testing four bytes at a time. */size_tstrlen (str) const char *str;{ const char *char_ptr; const unsigned long int *longword_ptr; unsigned long int longword, magic_bits, himagic, lomagic;/* Handle the first few characters by reading one character at a time.
Do this until CHAR_PTR is aligned on a longword boundary. */ for (char_ptr = str; ((unsigned long int) char_ptr & (sizeof (longword) - 1)) != 0; ++char_ptr) if (*char_ptr == '/0') return char_ptr - str;/* All these elucidatory comments refer to 4-byte longwords,
but the theory applies equally well to 8-byte longwords. */longword_ptr = (unsigned long int *) char_ptr;
/* Bits 31, 24, 16, and 8 of this number are zero. Call these bits
the "holes." Note that there is a hole just to the left of each byte, with an extra at the end:bits: 01111110 11111110 11111110 11111111
bytes: AAAAAAAA BBBBBBBB CCCCCCCC DDDDDDDDThe 1-bits make sure that carries propagate to the next 0-bit.
The 0-bits provide holes for carries to fall into. */ magic_bits = 0x7efefeffL; himagic = 0x80808080L; lomagic = 0x01010101L; if (sizeof (longword) > 4) { /* 64-bit version of the magic. */ /* Do the shift in two steps to avoid a warning if long has 32 bits. */ magic_bits = ((0x7efefefeL << 16) << 16) | 0xfefefeffL; himagic = ((himagic << 16) << 16) | himagic; lomagic = ((lomagic << 16) << 16) | lomagic; } if (sizeof (longword) > 8) abort ();/* Instead of the traditional loop which tests each character,
we will test a longword at a time. The tricky part is testing if *any of the four* bytes in the longword in question are zero. */ for (;;) { /* We tentatively exit the loop if adding MAGIC_BITS to LONGWORD fails to change any of the hole bits of LONGWORD.1) Is this safe? Will it catch all the zero bytes?
Suppose there is a byte with all zeros. Any carry bits propagating from its left will fall into the hole at its least significant bit and stop. Since there will be no carry from its most significant bit, the LSB of the byte to the left will be unchanged, and the zero will be detected.2) Is this worthwhile? Will it ignore everything except
zero bytes? Suppose every byte of LONGWORD has a bit set somewhere. There will be a carry into bit 8. If bit 8 is set, this will carry into bit 16. If bit 8 is clear, one of bits 9-15 must be set, so there will be a carry into bit 16. Similarly, there will be a carry into bit 24. If one of bits 24-30 is set, there will be a carry into bit 31, so all of the hole bits will be changed.The one misfire occurs when bits 24-30 are clear and bit
31 is set; in this case, the hole at bit 31 is not changed. If we had access to the processor carry flag, we could close this loophole by putting the fourth hole at bit 32!So it ignores everything except 128's, when they're aligned
properly. */longword = *longword_ptr++;
if (
#if 0 /* Add MAGIC_BITS to LONGWORD. */ (((longword + magic_bits)/* Set those bits that were unchanged by the addition. */
^ ~longword)/* Look at only the hole bits. If any of the hole bits
are unchanged, most likely one of the bytes was a zero. */ & ~magic_bits)#else ((longword - lomagic) & himagic)#endif != 0) { /* Which of the bytes was the zero? If none of them were, it was a misfire; continue the search. */const char *cp = (const char *) (longword_ptr - 1);
if (cp[0] == 0)
return cp - str; if (cp[1] == 0) return cp - str + 1; if (cp[2] == 0) return cp - str + 2; if (cp[3] == 0) return cp - str + 3; if (sizeof (longword) > 4) { if (cp[4] == 0) return cp - str + 4; if (cp[5] == 0) return cp - str + 5; if (cp[6] == 0) return cp - str + 6; if (cp[7] == 0) return cp - str + 7; } } }}转载地址:http://tskmi.baihongyu.com/