Machine Learning (PDF)

Summary

This document discusses machine learning, covering different types of learning, including supervised, unsupervised, and reinforcement learning. It also examines various data types, illustrating the concepts with examples such as car fuel consumption and student performance.

Full Transcript

Translated from English to Arabic - www.onlinedoctranslator.com

Machine Learning

How do humans learn?

Learning under expert guidance
- A child learns things from their parents (this is a hand, this is water, this is food, the sky is blue, ...).
- When the child starts going to school, teachers start teaching alphabets, numbers, science, and so on.
- At every stage of human life, learning is imparted by someone who has experience in that field.

Learning guided by knowledge gained from experts
- A child can group together all objects of the same colour even if the parents never taught exactly that. The child can do it because, at some point, the parents explained which colour is blue, which is red, which is green, and so on.
- At the postgraduate level, researchers solve previously unsolved problems based on their prior knowledge.

Self-learning
- A child learning to walk through obstacles bumps into them and falls several times before learning that an obstacle has to be crossed over.
- Not everything is taught by others. Many things can only be learned from our own past attempts. Based on our experiences, we tend to form a checklist of things we should do and things we should not do.

What is machine learning?
'A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.' [Tom Mitchell]

Types of machine learning
Machine learning can be classified into three broad categories:
- Supervised learning algorithms are trained on labeled data.
- Unsupervised learning algorithms are trained on data without labels; the goal is to find relationships in the data.
- Reinforcement learning algorithms learn by observing an environment, selecting and performing actions, and getting rewards in return, or penalties in the form of negative rewards.
  - Examples of reinforcement learning: self-driving cars, intelligent robots, etc. Google DeepMind's AlphaGo defeated Ke Jie, the world's top-ranked Go player, in 2017.

Some examples of supervised learning are:
- Predicting the result of a football match.
- Predicting whether a tumour is malignant or benign.
- Predicting the price of houses.
- Classifying emails as spam or not spam.

When we try to predict the category of a new data sample, the problem is known as classification. Some typical classification problems include:
- Image classification
- Disease prediction
- Win-loss prediction of games
- Prediction of natural disasters such as earthquakes and floods
- Handwriting recognition
- Vehicle number-plate recognition

When we try to predict a real value for a new data sample, the problem is known as regression. Some typical regression problems include:
- Demand forecasting in retail
- Sales prediction for managers
- Price prediction in real estate
- Weather forecasting
- Skill demand forecasting in the job market

Clustering is the main type of unsupervised learning. It aims to group similar objects in the data together. As a result, objects belonging to the same cluster are quite similar to each other, while objects belonging to different clusters are quite dissimilar. Some typical clustering problems include:
- Grouping crimes in Yemen by (age, education, region)
- Grouping the subscribers of a streaming service

Types of data
- A data set is a collection of records.

- Qualitative data is information about the quality of an object, which cannot be measured.
  - For example, the name or roll number of a student. Likewise, the performance of students expressed as 'good', 'average' or 'poor' is information that cannot be measured on a scale.
- Qualitative data can be divided into two types:
  - Nominal data
  - Ordinal data
- Nominal data has no numeric value, only a named value. Examples of nominal data:
  - Blood group: A, B, O, AB, etc.
  - Nationality: Yemeni, Indian, American, British, etc.
  - Gender: male, female
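Nominal values such as the blood groups listed above are only labels, so about the only numeric operation that makes sense on them is counting. A minimal Python sketch of that idea follows; the sample list and the variable names are illustrative assumptions, not data from the slides.

```python
from collections import Counter

# Hypothetical sample of a nominal attribute (blood group); not from the source.
blood_groups = ["A", "O", "B", "O", "AB", "A", "O"]

# Counting occurrences is meaningful for nominal data,
# whereas sums or averages of labels are not.
counts = Counter(blood_groups)

# The most frequent label and how often it occurs.
most_frequent_value, most_frequent_count = counts.most_common(1)[0]

print(dict(counts))
print(most_frequent_value, most_frequent_count)
```

The same counting approach would work for any attribute whose values are names rather than quantities.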
- Mathematical operations (addition, subtraction, multiplication, etc.) cannot be performed on nominal data. For the same reason, statistical functions such as mean and variance cannot be applied to nominal data either. A basic count is possible, however, so the mode can be identified for nominal data.
- Ordinal data has no numeric value but can be naturally ordered (we can say whether one value is better than or greater than another).
  - Customer satisfaction: 'very happy', 'happy', 'unhappy'
  - Grades: 'excellent', 'very good', 'good', 'poor', 'fail'
  - Hardness of metal: 'very hard', 'hard', 'soft', etc.
- As with nominal data, basic counting is possible for ordinal data, so the mode can be identified. Since ordering is possible, the median and quartiles can be identified too. The mean and variance still cannot be calculated.

- Quantitative data is information about the quantity of an object, which can be measured (and is therefore clearly ordered).
  - For example, the attribute 'marks' can be measured using a scale of measurement.
- There are two types of quantitative data:
  - Interval data
  - Ratio data
- Interval data is quantitative data for which the exact difference between values is also known, but which has no 'absolute zero' (a meaningful zero point).
  - For example, temperature is interval data. The difference between 12°C and 18°C is 6°C, the same as the difference between 15.5°C and 21.5°C.
  - A temperature of 0°C does not mean that there is no temperature (or no heat at all). It only means that the temperature is 10 degrees below 10°C.
  - Other examples include date, time, etc.
- For interval data, mathematical operations such as addition and subtraction are possible. For that reason, the mean, median, mode, standard deviation, etc. can be measured for interval data.
- Ratio data is quantitative data for which the exact difference between values is known and which also has an 'absolute zero'.
- For ratio data, mathematical operations such as addition and subtraction are possible, and because of the absolute zero, ratios between values are meaningful as well. The mean, median, mode and standard deviation can be measured.
  - For example, attributes such as height, weight, age and salary are ratio data.

Exploring the structure of data

Exploring numerical data

Understanding central tendency
- Measures of central tendency help to understand the central point of a set of data.
  - Mean: the sum of all data values divided by the number of data elements.
  - Median: by contrast, the median is the value of the element that appears in the middle of an ordered list of the data elements.
- Why review two measures of central tendency? Because the mean and the median are affected differently by data values that appear at the beginning or the end of the range.
- The mean is very sensitive to outliers (values that are unusually high or low compared with the other values).
- If we observe that, for some attributes, the deviation between the mean and the median is quite high, we should investigate those attributes further.

Auto MPG data set: fuel consumption of different cars, in miles per gallon.
- The deviation is significant for the attributes 'cylinders', 'displacement' and 'origin'. So we need to look at some more statistics for these attributes.
- There is also a problem in the values of the attribute 'horsepower', because of which the mean and median cannot be calculated.

Understanding data spread
- We now have a clear idea of which attributes show a large deviation between mean and median. Let us look more closely at those attributes in terms of:
  - Dispersion of the data
  - Position of the different data values

Data dispersion:
- Consider the data values of two attributes:
  - Attribute 1 values: 44, 46, 48, 45 and 47 (mean = 46)
  - Attribute 2 values: 34, 46, 59, 39 and 52 (mean = 46)
- Both attributes have the same mean. However, the values of attribute 1 are concentrated around the mean, while the values of attribute 2 are quite spread out, or dispersed.
- To measure how far the data values are spread, the variance of the data is computed as follows:

  \[ \mathrm{variance}(x) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]

  where \bar{x} is the mean of the data elements and n is their number.
- A larger variance indicates more dispersion in the data, and vice versa. For the example above:

  \[ \mathrm{variance}_1 = \frac{(44-46)^2 + (46-46)^2 + (48-46)^2 + (45-46)^2 + (47-46)^2}{5} = \frac{10}{5} = 2 \]

  \[ \mathrm{variance}_2 = \frac{(34-46)^2 + (46-46)^2 + (59-46)^2 + (39-46)^2 + (52-46)^2}{5} = \frac{398}{5} = 79.6 \]

- So the measure makes it clear that the values of attribute 1 are concentrated around the mean, while the values of attribute 2 are widely spread.

Measuring data value position:
- The data values of an attribute are arranged in increasing order and divided into two halves. Each half is then divided into two halves again. This gives five reference points: minimum, Q1, median (Q2), Q3 and maximum.
- We look at the differences between consecutive points (minimum and Q1, Q1 and median, median and Q3, Q3 and maximum). If the differences in the upper part of the range are larger, the larger values are more spread out than the smaller values.
- This helps to explain why the mean is much higher than the median for the attribute 'displacement'.
- However, we still cannot be sure whether there is any outlier in the data. For that, we can use data visualization.

Plotting and exploring numerical data

Box plot:
- A box plot (also called a box-and-whisker plot) gives a standard visualization of the five-number summary of a data set: minimum, first quartile (Q1), median (Q2), third quartile (Q3) and maximum. Below is a detailed interpretation of a box plot.
  [Figure: annotated box plot]
- The box plot for the attribute 'cylinders' looks rather odd. The upper whisker is missing, the median falls at the bottom of the box, and even the lower whisker is very short compared with the length of the box. Is everything all right?
- The answer is a big yes. The attribute 'cylinders' is discrete in nature, with values ranging from 3 to 8.

Histogram:
- A histogram shows the frequency of numerical data using rectangles whose area is proportional to the frequency of the data and whose width equals the class interval.
- Histograms can have different shapes depending on the nature of the data.

Exploring qualitative data
- There are not many options for exploring qualitative data. We can show the unique values available for an attribute.
- For example, for the attribute 'car name':
  1. Chevrolet Chevelle Malibu
  2. Buick Skylark 320
  3. Plymouth Satellite
  4. AMC Rebel SST
  5. Ford Torino
  6. Ford Galaxie 500
  7. Chevrolet Impala
  8. Plymouth Fury III
  9. Pontiac Catalina
  10. AMC Ambassador DPL
- We can also look in more detail and get a table with the count of data elements for each value.
- We may also be interested in the proportion (or percentage) of data elements for each value.

Exploring the relationship between attributes

Scatter plot:
- A two-dimensional plot that helps to visualize the relationship between two attributes (variables).
  [Figure: scatter plot with an outlier marked]

Data quality and data remediation

Data quality:
- The success of machine learning depends largely on the quality of the data. Data of the right quality helps to achieve better prediction accuracy.
- We have already come across at least two types of problems:
  - Data elements without a value, i.e. data with missing values.
  - Data elements whose value is very different from that of the other elements, which we call outliers.

Data remediation:
- The data quality issues mentioned above need to be remedied to achieve the right level of efficiency.

1) Outliers
- Outliers are data elements with an abnormally high (or low) value, which may affect prediction accuracy.

Detecting outliers:
- There are a number of techniques for detecting outliers. We discuss a few of them:
  - Box plot
  - Scatter plot
  - Tukey fences
  - Z-score

Tukey fences method
- It is based on the interquartile range, IQR = Q3 - Q1.
- With Tukey fences, outliers are the values that are:
  - less than Q1 - (1.5 × IQR), or
  - more than Q3 + (1.5 × IQR)

Z-score method
- The Z-score indicates how many standard deviations a data point lies from the mean. It has the following formula:

  \[ z_i = \frac{x_i - \mu}{\sigma} \]

  where x_i is the data point, \mu is the mean of the data set and \sigma is the standard deviation.
- A data point is considered an outlier if its Z-score is:
  - greater than 3, or
  - less than -3

Handling outliers
- Once the outliers have been identified and the decision has been taken to fix them, you can consider one of the following approaches:
  - Remove outliers: if the number of outlier records is not large, we can simply remove them.
  - Imputation: another way is to replace the outlying value with the mean, median or mode of all values of the attribute.
  - Capping: removing outliers may drop a large number of records from the data set, which is undesirable in some cases. Capping replaces the outliers with bounded maximum or minimum values instead. We can use percentile capping: values below the value at the 1st percentile are replaced by the value at the 1st percentile, and values above the value at the 99th percentile are replaced by the value at the 99th percentile. Capping at the 5th and 95th percentiles is also common.

2) Missing values
- In a data set, one or more data elements may have missing values in multiple records.
- There are several strategies for handling missing values of data elements. Among them:
  - Eliminate the records that contain a missing value.
  - Impute the records that contain a missing value: all missing values are imputed with the mean, median or mode (as appropriate) of the remaining values of the same attribute.
  - Estimate the missing values: if there are records similar to the ones with missing values, the attribute values from those similar records can be used in place of the missing value. For example, if the weight of a 12-year-old Russian student who is 5 feet tall is missing, the weight of any other Russian student who is about 12 years old and about 5 feet tall can be used.

Features
- What is a feature?
  A feature is an attribute of a data set that is used in a machine learning process.
  Consider the Iris data set:
  [Figure: sample of the Iris data set]

Feature engineering
- What is feature engineering?
  Feature engineering is an important pre-processing step for machine learning.
  It is the process of manipulating a data set to form features that represent the data set more effectively and result in better learning performance.
- Feature engineering has two major elements:
  - Feature transformation
  - Feature selection

Feature transformation
- Feature transformation is the process of creating new features from existing features.
- There are two variants of feature transformation:
  - Feature construction
  - Feature extraction
- Feature construction is the process of creating additional new features from existing features by discovering missing information about the relationships between features. Feature construction therefore expands the feature space. For example, if there are n features in the raw data set, m more features may be added by feature construction, so that the data set ends up with n + m features.
- For example, suppose a data set has three attributes: apartment length, apartment breadth and apartment price. Used as input to a regression problem, such data can train a regression model. However, it is more convenient and more meaningful to use the area of the apartment, which is not an existing feature of the data set. So an apartment area feature can be constructed from the length and the breadth.
- Feature extraction is the process of creating additional new features from existing features using some mathematical function.
- Another example is the log transform. It is used to bring data that is not normally distributed closer to a normal distribution:

  \[ y = \log(x) \]

- In certain situations, feature construction is an essential activity before we can start the machine learning task:
  - When features have qualitative values and machine learning needs numeric values
  - When features have numeric (continuous) values and machine learning needs ordinal values
  - When dealing with a text-specific feature

Encoding nominal features
- Consider the following data set.
  [Table: data set with the nominal features 'City of origin', 'Parents athlete' and 'Chance of win']
- To solve this problem (nominal values that a learning algorithm cannot use directly), feature construction can be used to create new dummy features that machine learning algorithms can use.
- Since the feature 'City of origin' has three unique values, namely City A, City B and City C, three dummy features are created: origin_city_A, origin_city_B and origin_city_C. Each has the value 0 or 1, depending on the qualitative value of the original feature in that row.
- In the same way, the dummy features parents_athlete_Y and parents_athlete_N are created for the feature 'Parents athlete', and win_chance_Y and win_chance_N are created for the feature 'Chance of win'.

Encoding ordinal features
- Take a student data set as an example. Assume there are three variables: science_marks, maths marks and grade, as shown below.
  [Table: student data set]
- Grade is an ordinal variable with the values A, B, C and D. To turn it into a numeric variable, we create a new feature num_grade that maps a numeric value to each ordinal value. The grades A, B, C and D are mapped to the values 1, 2, 3 and 4 in the new feature, as shown in the following figure.
  [Figure: student data set with num_grade added]

Transforming numeric (continuous) features into categorical features
- For example, we may want to treat real estate price prediction, which is a regression problem, as real estate price category prediction, which is a classification problem. In that case, we can bin the numerical data into multiple categories based on the data range.

Encoding text features
- Nowadays text is the most widespread medium of communication, whether we think of email services, Facebook, Twitter or WhatsApp.
- Bag of words (BoW) is a feature construction method that converts text data into a numerical representation.
- Steps for building a bag of words:
  - Tokenize the text (split it into words using whitespace and punctuation).
  - Identify the unique words to build the vocabulary.
  - Count the number of occurrences of each token.
- For example, consider the text ["This is a good phone", "This is a bad mobile", "She is a good cat", "He has a bad temper", "This mobile phone is not good"].
  - The text is tokenized as
    ["This", "is", "a", "good", "phone", "This", "is", "a", "bad", "mobile", "She", "is", "a", "good", "cat", "He", "has", "a", "bad", "temper", "This", "mobile", "phone", "is", "not", "good"]
  - The unique words form the vocabulary
    ["a", "bad", "cat", "good", "has", "he", "is", "mobile", "not", "phone", "she", "temper", "this"]
  - The sentence "This is a good phone, this is a good mobile" is then represented using the vocabulary as
    (2, 0, 0, 2, 0, 0, 2, 1, 0, 1, 0, 0, 2)

Feature selection
- Feature selection is the process of selecting the most important features from the set of existing features for our machine learning task.

Issues in high-dimensional data
- 'High-dimensional data' refers to a data set with a large number of features (dimension = number of features).
- For example, in DNA analysis, DNA data can have up to 450,000 dimensions (gene probes).
- In large text data (such as books), the number of unique words (tokens) that make up the features of the text data set can also reach tens of thousands.
- Such high-dimensional data can be a big challenge for any machine learning algorithm because:
  - A large amount of computational resources and a large amount of computation time are required.
  - The performance of the model degrades sharply because of the unnecessary data.

Key factors in feature selection
- Two factors drive feature selection:
  1) Feature relevance
  2) Feature redundancy

Feature relevance
- A feature may contribute information that is irrelevant in the context of the machine learning task.
- Take a simple example of a student data set. For the supervised task of predicting student grades, or the unsupervised task of grouping students with similar academic ability, the attribute roll number is completely irrelevant.
- Any feature that is irrelevant in the context of the machine learning task is a candidate for rejection when we select a subset of features.

Feature redundancy
- A feature may contribute information that is similar to the information contributed by one or more other features.
- For example, in a weight prediction problem, the features Age and Height contribute similar information. As Age increases, Weight is expected to increase. Similarly, as Height increases, Weight is expected to increase. Age and Height also increase with each other. So, in the context of weight prediction, Age and Height contribute similar information. In other words, whether or not the feature Height is part of the feature subset, the learning model will give almost the same results. The same holds for the feature Age.
- In this kind of situation, when one feature is similar to other features, the feature is said to be potentially redundant in the context of the learning problem.
- All potentially redundant features are candidates for rejection from the final feature subset. Only a small number of representative features out of a set of potentially redundant features are kept in the final feature subset.
- Put simply, the main objective of feature selection is to remove all irrelevant features and to keep a representative subset of the potentially redundant ones.
- The question now is how to find out which features are irrelevant and which are potentially redundant.

Measures of feature relevance
- Mutual information (MI) is a measure of the amount of information we can learn about one feature by observing the values of another feature.
- The mutual information between two features is a non-negative value that measures the dependency between them. It is zero if and only if the two features are independent, and higher values mean higher dependency.
- Hence, for supervised learning, where we have a class label, mutual information is considered a good measure of the relationship between any feature and the class label.
- The mutual information between a feature X and the class label Y is defined as

  \[ \mathrm{MI}(X, Y) = \mathrm{Entropy}(X) + \mathrm{Entropy}(Y) - \mathrm{Entropy}(X, Y) \]

  where the entropy of a feature X is defined by

  \[ \mathrm{Entropy}(X) = -\sum_{x_i \in X} p(x_i) \log_2 p(x_i) \]

  and Entropy(X, Y) is the joint entropy, defined by

  \[ \mathrm{Entropy}(X, Y) = -\sum_{x_i \in X} \sum_{y_j \in Y} p(x_i, y_j) \log_2 p(x_i, y_j) \]

- We can use a ready-made function from scikit-learn called sklearn.feature_selection.mutual_info_classif.
- I will use the breast cancer data set that ships with scikit-learn to see the mutual information of all features with respect to the class label.

Step 1. Load the breast cancer data

    from sklearn.datasets import load_breast_cancer

    # Bundled data set: 30 numeric features and a binary class label.
    cancer = load_breast_cancer()
    X = cancer['data']
    y = cancer['target']

Step 2. Compute the MI values

    from sklearn.feature_selection import mutual_info_classif

    # One MI score per feature, measured against the class label y.
    mi_values = mutual_info_classif(X, y)
    print(mi_values)

You will see an mi_values array like this (the exact values can differ slightly from run to run):

    [0.37032 0.09670 0.40294 0.36009 0.08427 0.21318 0.37337
     0.43985 0.06456 0.00276 0.24866 0.00189 0.27600 0.33955
     0.01503 0.07603 0.11825 0.12879 0.00967 0.03802 0.45151
     0.12293 0.47595 0.46426 0.09558 0.22647 0.31469 0.43696
     0.09717 0.06735]

These are 30 numbers representing the MI score of the 30 features.

Measures of feature redundancy
- As we have seen, feature redundancy is the contribution of similar information by multiple features. There are several measures of the similarity of information contribution:
  - Correlation-based measures
  - Distance-based measures
  - Other similarity measures

1) Correlation-based similarity measure
- Correlation is a measure of the linear dependency between two features.
- Pearson's correlation coefficient is one of the most popular correlation measures.
- For two features F_1 and F_2, Pearson's correlation coefficient is defined as:

  \[ \rho(F_1, F_2) = \frac{\sum_{i=1}^{n} (F_{1i} - \bar{F}_1)(F_{2i} - \bar{F}_2)}{\sqrt{\sum_{i=1}^{n} (F_{1i} - \bar{F}_1)^2} \, \sqrt{\sum_{i=1}^{n} (F_{2i} - \bar{F}_2)^2}} \]

- Correlation values range between +1 and -1. A correlation of 1 (+ or -) indicates perfect correlation, i.e. the two features have a perfect linear relationship. A correlation of 0 means that the two features have no linear relationship.
- In general, two features with a strong linear relationship are potentially redundant.

2) Distance-based similarity measure
- The most common distance measure is the Euclidean distance, which is computed between two features F_1 and F_2 as:

  \[ d(F_1, F_2) = \sqrt{\sum_{i=1}^{n} (F_{1i} - F_{2i})^2} \]

- For example, the Euclidean distance between the features aptitude (F_1) and communication (F_2) works out as:

  \[ d(F_1, F_2) = \sqrt{81.75} = 9.04 \]

- A more general form of the Euclidean distance is the Minkowski distance:

  \[ d(F_1, F_2) = \left( \sum_{i=1}^{n} |F_{1i} - F_{2i}|^r \right)^{1/r} \]

  - When r = 2, the Minkowski distance takes the form of the Euclidean distance (also called the L2 norm).
  - When r = 1, the Minkowski distance takes the form of the Manhattan distance (also called the L1 norm):

    \[ d(F_1, F_2) = \sum_{i=1}^{n} |F_{1i} - F_{2i}| \]

- A specific case of the Manhattan distance is the Hamming distance, which is frequently used to calculate the distance between binary vectors.

3) Other similarity measures
- The Jaccard index is used as a measure of similarity between two features. The Jaccard distance, a measure of dissimilarity between two features, is the complement of the Jaccard index.
- For two features with binary values, the Jaccard index is

  \[ J = \frac{n_{11}}{n_{01} + n_{10} + n_{11}} \]

  where n_{11} is the number of cases in which both features have the value 1, n_{01} is the number of cases in which the first feature has the value 0 and the second feature has the value 1, and n_{10} is the number of cases in which the first feature has the value 1 and the second feature has the value 0.
- In the slide's example, J = 0.4, so the Jaccard distance is d_J = 1 - J = 0.6.

- Cosine similarity is one of the most popular measures of text similarity.
- We know that text data first needs to be turned into features, where each word token is a feature and the number of times the word occurs in a document is the value in each row.
- The cosine similarity of two vectors x and y is defined by:

  \[ \cos(x, y) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert} \]

  where x \cdot y = \sum_{i=1}^{n} x_i y_i is the dot product of x and y, and \lVert x \rVert = \sqrt{\sum_{i=1}^{n} x_i^2}.
- For example, let us compute the cosine similarity of two tokenized sentences x and y, where x = (2, 4, 0, 0, 2, 1, 3, 0, 0) and y = (2, 1, 0, 0, 3, 2, 1, 0, 1).

  \[ x \cdot y = 2 \cdot 2 + 4 \cdot 1 + 0 \cdot 0 + 0 \cdot 0 + 2 \cdot 3 + 1 \cdot 2 + 3 \cdot 1 + 0 \cdot 0 + 0 \cdot 1 = 19 \]

  \[ \lVert x \rVert = \sqrt{2^2 + 4^2 + 0^2 + 0^2 + 2^2 + 1^2 + 3^2 + 0^2 + 0^2} = \sqrt{34} = 5.83 \]

  \[ \lVert y \rVert = \sqrt{2^2 + 1^2 + 0^2 + 0^2 + 3^2 + 2^2 + 1^2 + 0^2 + 1^2} = \sqrt{20} = 4.47 \]

  \[ \cos(x, y) = \frac{19}{5.83 \times 4.47} = 0.729 \]

- In effect, cosine similarity measures the angle between the vectors x and y. If the cosine similarity is 1, the angle between x and y is 0°, which means that x and y are the same except for their magnitude. If the cosine similarity is 0, the angle between x and y is 90°, so they share no similarity (for text data, they have no word in common). In the example above, the angle works out to 43.2°.
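The worked cosine-similarity example above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name cosine_similarity and the variable names are illustrative assumptions, while the two count vectors are the ones given in the example.

```python
import math

# Count vectors of the two tokenized sentences from the worked example.
x = [2, 4, 0, 0, 2, 1, 3, 0, 0]
y = [2, 1, 0, 0, 3, 2, 1, 0, 1]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm_a = math.sqrt(sum(ai * ai for ai in a))
    norm_b = math.sqrt(sum(bi * bi for bi in b))
    return dot / (norm_a * norm_b)


similarity = cosine_similarity(x, y)

# Convert the cosine back into the angle between the vectors, in degrees.
angle_degrees = math.degrees(math.acos(similarity))

print(round(similarity, 3))     # 0.729
print(round(angle_degrees, 1))  # 43.2
```

Working from the raw counts rather than the rounded norms 5.83 and 4.47 avoids accumulating rounding error, and still reproduces the figures quoted in the example.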
