Keywords
Sensory substitution (SSDs)
Visual rehabilitation
Visual plasticity
Blind
1. Introduction
In this review we describe approaches to using sensory substitution devices (SSDs) to help the visually impaired. Section 2 introduces the problem of visual rehabilitation in general, attempts to deal with this problem, and in particular experiments involving sensory substitution devices (SSDs). Section 3 briefly discusses the reasons for the limited adoption of SSDs. Section 4 presents recent theoretical, practical and technological advances. Section 5 puts forward some practical steps to bridge the gap between the use of SSDs for research and their applicability for practical visual rehabilitation in everyday use by the blind community.
2. The challenge of blindness, and visual rehabilitation approaches
In this section we describe the goals of visual rehabilitation (Section 2.1), current and near-future approaches (Section 2.2) and sensory substitution devices (Section 2.3), and explore whether “seeing” via sensory substitution devices counts as vision (Section 2.4).
2.1. Goals of visual rehabilitation
Over 285,000,000 people worldwide are affected by severe visual impairments, of whom nearly 40 million are blind. This constitutes both a clinical and scientific challenge to develop effective visual rehabilitation techniques (WHO, 2012). These visual impairments arise from a wide variety of etiologies, and in many cases require completely different types of treatment. Additionally, the vast majority of the visually impaired live in developing countries and in harsh economic conditions, such that any comprehensive solution must be both relatively cheap and easily available (Held et al., 2011, WHO, 2012).
2.2. Current and near-future invasive methodologies
There are a number of current approaches to visual rehabilitation (see Striem-Amit et al., 2011 for recent reviews of these and other methods). Invasive approaches aim at physically replacing or restoring the function of the peripheral visual system, for instance by using artificial retinal prostheses (Ahuja et al., 2011, Chader et al., 2009, Collignon et al., 2011a, Djilas et al., 2011, Humayun et al., 2012, Rizzo, 2011, Wang et al., 2012, Zrenner et al., 2011), gene therapy (Busskamp et al., 2010) or transplantation of photoreceptors (Yang et al., 2010). However, while in the long term these solutions hold great promise, they still face huge hurdles in terms of technical capabilities and the ability to customize them to specific etiologies (the type and severity of visual deterioration and the site of the lesion along the visual pathways), are extremely expensive, and provide only very low-resolution end-result sight (Humayun et al., 2012). In addition, even these limited results still require a very long and arduous visual rehabilitation process.
2.3. Sensory substitution devices (SSDs)
A different approach, known as Sensory Substitution, is designed to convey visual information to the visually impaired by systematically substituting visual information into one of their intact senses. Sensory substitution devices (SSDs) are non-invasive human–machine interfaces which, in the case of the blind, transform visual information into auditory or tactile representations using a predetermined transformation algorithm (see Fig. 1 for illustration).
Fig. 1. SSD – Left: an illustration of sensory substitution by tactile stimulation on the tongue (left) and auditory stimulation (right). Right: a sample setup with a small computer, bone-conductance headphones and camera glasses.
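To make the idea of a "predetermined transformation algorithm" concrete, the following is a minimal sketch of one possible visual-to-auditory mapping. It is loosely modelled on the column-scan principle used by visual-to-auditory devices such as the vOICe (horizontal position mapped to time, vertical position to pitch, brightness to loudness); the function name `image_to_soundscape`, the frequency range, the sample rate and the normalisation are all illustrative assumptions and are not taken from any actual device.

```python
import math


def image_to_soundscape(image, duration_s=1.0, sample_rate=16000,
                        f_low=500.0, f_high=5000.0):
    """Turn a grayscale image into a list of audio samples in [-1, 1].

    `image` is a list of rows with pixel values in [0, 1]; row 0 is the top.
    All parameter defaults are hypothetical, chosen only for illustration.
    """
    n_rows, n_cols = len(image), len(image[0])

    # One sine tone per row, exponentially spaced; the top row gets the
    # highest pitch (assumed mapping: higher in the image = higher pitch).
    if n_rows > 1:
        freqs = [f_high * (f_low / f_high) ** (r / (n_rows - 1))
                 for r in range(n_rows)]
    else:
        freqs = [f_low]

    # The image is scanned column by column, left to right, over duration_s.
    samples_per_col = int(duration_s * sample_rate / n_cols)
    samples = []
    for c in range(n_cols):
        for k in range(samples_per_col):
            t = (c * samples_per_col + k) / sample_rate
            # Pixel brightness sets the loudness of that row's tone.
            value = sum(image[r][c] * math.sin(2 * math.pi * freqs[r] * t)
                        for r in range(n_rows))
            samples.append(value / n_rows)  # keep the mix within [-1, 1]
    return samples


# A 2x2 image with a single bright pixel in the top-right corner:
# silence during the first half of the scan, a high tone in the second half.
soundscape = image_to_soundscape([[0.0, 1.0], [0.0, 0.0]])
```

The resulting samples could, for instance, be scaled to 16-bit integers and written to a file with Python's standard `wave` module. A tactile device would replace the tone generator with an array of stimulators, but the principle of a fixed, learnable mapping from pixels to another sense stays the same.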
The first such structured substitution system is probably Braille reading. This technique, developed originally by Barbier as a means for writing and reading in the dark for the French military in the Napoleonic era, was later revised by Louis Braille to enable the blind to read by substituting visual letters with tactile ones. This was further developed in the early 1950s with the development of automatic text-to-braille converters such as the Optacon (Goldish and Taylor, 1974).
A highly interesting effort, which is often neglected historically, was the Elektroftalm, which attempted to electronically transform a visual image into auditory (late 1890s) and tactile (1950s) stimulation (Starkiewicz and Kuliszewski, 1965) using one or several sensors.
These early attempts led to the more organized and methodological attempts of Paul Bach-y-Rita in the 1970s, which positioned him as the pioneer of the extensive use of sensory substitution for research. Bach-y-Rita focused on tactile devices and specifically a prototype device he named the “Tactile Vision Sensory Substitution” (TVSS), which blind users could use for tasks such as recognizing large letters, catching a ball tossed at them and so on (Bach-y-Rita, 1972).
The work of Bach-y-Rita suggested that these devices could serve as stand-alone aids for limited daily use, providing otherwise non-existing visual capabilities such as perception of shape, color and location. Additionally, as SSDs are relatively low-cost they could be made accessible to the majority of the world's visually impaired population, who as mentioned above primarily reside in developing countries and have limited access to advanced medical treatment (Held et al., 2011, WHO, 2012).
SSDs have enormous potential for non-invasive rehabilitation for the majority of the blind. In over 95% of all cases of blindness, the problem is not in the visual/occipital parts of the brain but rather in the eye, retina or the visual pathways (WHO, 2012). In addition, in the subset of cases where the visual pathways between the ganglion cells and the visual cortex are damaged, approaches that repair the retina would not be able to convey the information from there to the brain, leaving SSDs as the main potential therapeutic approach. However, despite several decades of research, the use of SSDs has hardly exploited this vast potential. Before we explore the reasons for this relative failure, and how this might be remedied in the near future based on recent theoretical, practical and technological advances, it is worth inquiring how ‘seeing’ using SSDs compares to natural vision.
2.4. Can using SSDs be considered “seeing”?
Unlike invasive approaches, which intuitively are quite similar to the normal process of vision, can the use of SSDs really be considered as “seeing”? Although one could argue that sensory substitution, which mostly lacks visual qualia, is not truly ‘real’ vision, if ‘seeing’ is defined as the ability to create a mental representation of the shape, surface properties, and location of surrounding objects and to interact with them in a manner comparable to a normally sighted person (Bach-y-Rita, 1972), then SSDs indeed enable the blind to ‘see’ using their intact senses. For a more detailed recent discussion on the subject of defining the use of SSDs as vision, see Deroy and Auvray (2012), Connolly et al. (2013), and Ward and Meijer (2010).
Although anecdotal, this subjective testimony captures the experience of a late blind user of the vOICe SSD (Meijer, 1992) who had vision for about twenty years: “You can develop a sense of the soundscapes within two to three weeks. Within three months or so you should start seeing the flashes of your environment where you’d be able to identify things just by looking at them… It is sight,” she says. “I know what sight looks like. I remember it.” (Pat Fletcher in an article in ACB's Braille forum).
fMRI measurements from the brains of late-blind subjects have shown significant activation in visual object-related areas (Amedi et al., 2001), and experimentally impeding activity in this same visual occipital cortex impaired their object recognition abilities when using SSDs (Merabet et al., 2009).
These cases show that the potential for qualia exists, even though current devices are far from rendering it completely.
3. Difficulties with using SSDs for visual rehabilitation
In this section we will discuss the general reasons why sensory substitution devices have not been widely adopted (Section 3.1), including problems with the devices themselves (Section 3.2) and the wider challenge of visual rehabilitation from both empirical (Section 3.3) and theoretical (Section 3.4) perspectives.
3.1. Why haven’t these devices been widely adopted?
Despite their promising potential and many years of development, SSDs have not been widely adopted. Only a few visual-to-auditory/tactile SSDs have ever been used outside of controlled research settings in the lab, and to the best of our knowledge no SSD has been adopted as the main tool by a wide blind community (Loomis, 2010). The underlying problem is two-fold. First, there are problems with the SSDs themselves. Second, there is a basic theoretical factor constraining their potential, which is related to the limitations of visual rehabilitation in general.
3.2. Problems with the SSDs themselves
In the past, SSD adoption has faced a number of stumbling blocks. Devices were expensive, cumbersome, hard for blind users to set up and operate, and not efficient enough for real-world use. Psychological and social factors, such as the reluctance to try new devices, have also hampered their adoption. However, the absence of organized training procedures is arguably the biggest obstacle. Potential users often have to train themselves on these devices at home without an instructor physically present or a clear set of lessons to follow. Thus, although they enabled access to visual information, SSDs were simply not practical enough for every-day tasks in the real world.
On the other hand, the use of SSDs for research has flourished. A range of results has been obtained using these devices for scientific exploration of the senses and sensory-related issues (including, among many others, Amedi et al., 2007, Auvray et al., 2007, Matteau et al., 2010, Renier and De Volder, 2010, Stiles et al., 2012, Wright et al., 2012), some of which will be elaborated on in Section 4. Unfortunately, this body of research has contributed only partially to rehabilitation efforts, since the participants in laboratory studies mainly undergo controlled training that is tailored for specific experiments but does not deal with important factors such as active sensing and closing the sensory-motor loop (Reynolds and Glenney, 2012) or acquiring generalization skills (Kim and Zatorre, 2008).
To summarize, SSDs are currently treated mainly as research tools, which has turned the spotlight away from their original goal of visual rehabilitation.
3.3. General limitations on visual rehabilitation
These SSD-specific factors are only part of the broader limitations on devices in general that have hampered visual rehabilitation for the congenitally blind, and to a lesser extent the late-blind as well. Researchers and clinicians have questioned whether full visual rehabilitation is even possible. Previous attempts at sight restoration in the adult congenitally blind have not been a cause for celebration, and patients often regretted undergoing the procedure (Fine et al., 2003, Gregory and Wallace, 1963, Von Senden, 1960): although the procedures themselves have been successful in several cases, the subjects were unable to achieve full visual function. The patients regained perception of light, and many were able to quickly learn to perceive motion and color, but their performance on tasks such as complex shape recognition, distance estimation and line contour integration was far below par, and even when successful required a far greater amount of time than expected. Worse, subjects reported feelings of pain and frustration from their new visual input, and most continued to live as though they were still blind.
3.4. Theoretical neurobiological basis for pessimism concerning visual rehabilitation
Why haven’t these treatments been successful? Why can’t the subjects learn to see? To answer these questions we will take a step back, and look at the larger neuroscience picture of how our brain processes sensory information, and at recent alternative models in this field, many of which derive from the findings mentioned in Section 3.2 and acquired using SSDs, before returning to our discussion of the practical usage of SSDs.
In traditional neuroscience, the common view is that the human brain is divided into the “visual cortex”, the “auditory cortex” and so on according to the sensory modality that elicits it, and into higher-order multisensory areas integrating information from these unimodal cortices (the sensory division-of-labor principle; Zeki, 1978). The vast majority of textbooks emphasize this organizational principle explicitly and implicitly. However, over the past decade there has been a growing number of articles suggesting that this view may not be fully accurate, a point which will be elaborated on further in this review in Section 4.1 (Amedi et al., 2001, Amedi et al., 2007, Merabet and Pascual-Leone, 2010, Pascual-Leone et al., 2005, Pascual-Leone and Hamilton, 2001, Pascual-Leone et al., 2011, Reich et al., 2011a, Renier et al., 2013).
It is well established that the ‘visual’ cortex of the blind becomes plastically recruited to process other modalities and even cognitive tasks such as language and memory (reviewed also in Frasnelli et al., 2011, Merabet and Pascual-Leone, 2010). Many of these changes start to occur within days following the onset of blindness (Pascual-Leone et al., 2005), and therefore affect not only the congenitally blind but also, though probably to a different extent, early and late blind individuals (Cohen et al., 1999, Lacey and Sathian, 2008, Sathian, 2005).
This plasticity may nevertheless be a double-edged sword. On the one hand, it helps the blind to better cope with blindness by supporting compensatory capabilities (Amedi et al., 2003, Amedi et al., 2004, Bedny et al., 2011, Gougoux et al., 2005, Rauschecker, 1995, Roder and Rosler, 2003) but at the same time, it may interfere with sight restoration efforts by altering the visual cortex's original functions. This means that attempts at rehabilitation pose risks not only when they fail, but also when they succeed. Even if we could somehow cause a plastic reorganization of the vision-deprived occipital cortex for processing the visual input, it might damage the use of these same areas for tasks customarily tapped for this purpose, potentially blocking habits and skills the individual has learned to rely on or impairing functions for which these areas were dedicated, such as memory. This problem impinges not only on the congenitally blind but to a lesser extent on the late blind as well, when their formerly “visual” areas are plastically partially recruited for other new tasks (Sathian and Lacey, 2007).
Another, though not mutually exclusive, mechanism often used to explain the failure of medical restoration cases in the congenitally blind is the existence of “critical periods” in early childhood in which the brain is particularly plastic, during which lack of vision may prevent the proper functional specialization and development of many of these visual regions. The watershed works of Hubel and Wiesel (1970) on the visual system of cats made this view one of the most basic tenets in visual research; namely, that the visual system cannot regain function if infants cannot see during their first years of life.
This may account for some of the failures of the medical cases mentioned above. In these cases of failed sight rehabilitation, it seems as if the gained visual input was made available to a brain that was wholly unpracticed at analyzing and interpreting this input, and the visual experience gained at this stage without supervised explicit training, and in contrast to normal development, may come too little or too late.
In conclusion, the main explanations for sight restoration failures have to do with missing “critical periods” required for the development of visual areas in the brain, and/or the plastic recruitment of the occipital cortex for other, non-visual tasks in late and early blindness. Together, these features may impede the re-emergence of visual abilities in the newly sighted in adulthood, and render attempts at visual restoration very difficult and limited.
4. The basis for some optimism for visual rehabilitation
In this section we discuss the shift from pessimism to potential hope for visual rehabilitation. We review changes in theories of brain organization (Section 4.1) and their implications for visual rehabilitation (Section 4.2). We then review recent results indicating that the substituted information is integrated into a shared sensory framework (Section 4.3) and discuss recent behavioral achievements using SSDs (Section 4.4). Finally, we explore new results in the field of visual rehabilitation using other rehabilitation approaches and their potential for visual rehabilitation (Section 4.5).
4.1. A different view of brain organization and re-organization
It is well established that the visual cortex in most sighted humans has a hierarchical organization and is composed of different functional areas, each processing different aspects of vision. For example, the most fundamental large-scale division of labor of the visual cortex is between two functional processing streams for objects, the dorsal ‘how and where is it?’ and the ventral ‘what is it?’ streams. Moreover, even within the stream specialized for processing object identity, different functional areas show preferential activation for different object categories. For example, the Fusiform Face Area (FFA) shows preference for processing faces, the Middle Temporal gyrus (MT) for visual motion, and the Visual Word Form Area (VWFA) for script reading and visual representation of language.
Surprisingly, over the past few years several of these basic brain regions which were once considered “visual” according to the theory of division of labor were shown to retain their function even without visual experience. For example, until recently it was unknown whether the development of the basic division into the two functional streams depends on visual experience, although the critical period theory predicts this to be the case. However, as recently reviewed by Collignon and Lepore (2012), there is now evidence for the existence of the two-stream distinction in the blind (see also Fiehler et al., 2009, Ptito et al., 2012), and a recent study using the vOICe SSD has shown that it can at least partially arise in the congenitally blind without any visual experience at all (Striem-Amit et al., 2011; Fig. 2). Additionally, non-visual stimuli have been able to activate each of these two streams in the sighted as well when performing tactile perception tasks (Prather et al., 2004, Sathian et al., 2011).
Fig. 2. “Visual” ventral and dorsal streams – fMRI map of the dorsal/ventral visual pathway division of labor in adult congenitally blind participants using SSDs (Striem-Amit et al., 2011).
The same functional organization retention despite sensory deprivation can be observed when examining some of the smaller functional areas within the streams, such as the LOC for tactile object perception, imagery (Deshpande et al., 2010) and location (Amedi et al., 2001, Amedi et al., 2002) or visual-to-auditory SSD object location (Amedi et al., 2007), the VWFA for reading braille (Reich et al., 2011b) or reading via a visual-to-auditory SSD (Striem-Amit et al., 2012a), MT for non-visual motion (Matteau et al., 2010, Poirier et al., 2005, Poirier et al., 2006, Ptito et al., 2009, Ricciardi et al., 2007, Ricciardi et al., 2009, Summers et al., 2009), the mirror network for auditory perception of action (Ricciardi et al., 2009), MOG for sound localization (Collignon et al., 2011b, Renier and De Volder, 2010) and Parieto-Occipital reach-related regions (Lingnau et al., 2012). Strikingly, even listening to sound echoes can activate the visual rather than the auditory cortex in blind echolocation experts (Thaler et al., 2011).
The LOC, MT and VWFA provide the most detailed examples (see Fig. 3). LOC, the lateral occipital cortex, was first shown to be activated in both tactile object recognition (TOR) and visual object recognition in the sighted, suggesting that part of LOC (LOtv) might actually be a sensory independent task operator for deciphering the geometrical shape of 2D and 3D objects (Amedi et al., 2001). In particular, the LOtv was activated by visual and tactile shape recognition tasks but not by object recognition by sound using associations (Amedi et al., 2002). This claim was paralleled by the theoretical framework suggested that same year by Pascual-Leone and Hamilton of the brain as a metamodal operator (Pascual-Leone and Hamilton, 2001). Other groups soon showed that vision and touch share shape information within the LOC (James et al., 2002). Further research into the multisensory nature of LOC revealed its activation during mental imagery (Zhang et al., 2004) and recognition of familiar tactile objects (Lacey et al., 2010) and activation for tactile shape over texture (Stilla and Sathian, 2008). More recent research has confirmed these findings by showing peak activation in the LOC for TOR without visual experience (Amedi et al., 2010) and by findings on the retrieval of shape information in the sighted, late blind and congenitally blind using visual-to-auditory SSDs (Amedi et al., 2007, Amedi et al., 2010, Lacey et al., 2009).
Fig. 3. Meta-modal research – Top: results obtained using SSD as sensory input. Left: LOC activated for SSD object recognition (Adapted from Amedi et al. (2007)). Right: VWFA activated for SSD letter reading (Adapted from Striem-Amit et al. (2012a)). Bottom: results obtained using tactile sensing as sensory input. Left: LOC activated bilaterally for left hand tactile object recognition (Adapted from Amedi et al. (2010)). Right: VWFA activated for Braille reading (Adapted from Reich et al. (2011a)).
Several different experimental approaches have shown that the MT or Middle Temporal gyrus, also known as V5, the area that processes visual motion, is activated by tactile motion in the absence of vision (Ptito et al., 2009) and by electro-tactile motion perceived on the tongue via the TDU SSD (Matteau et al., 2010).
Similarly, the VWFA, a ventral visual area that processes visual written language in the sighted, is used for reading Braille, which is a tactile process, and is also the location for the peak of selective activation to Braille words in the congenitally blind (Reich et al., 2011b). These results were further expanded by Striem-Amit et al. (2012a), who showed that the VWFA is activated when using the vOICe SSD for reading regular letters via an auditory soundscape. The VWFA was also shown to have exactly the same selectivity for letter strings vs. any other category in both vision and visual-to-auditory SSDs, even in subjects who have never seen.
All of these results contribute to a growing body of evidence accumulating over the last decade that challenges the canonical view of the sensory-specific brain. This evidence demonstrates that in both sighted and blind individuals the occipital visual cortex is not purely visual and that its functional specialization is independent of visual input (reviewed in Reich et al., 2011a, Ricciardi and Pietrini, 2011; and detailed below), despite showing a clear preference for the visual modality. This in turn has led to the hypothesis that the brain is task-oriented and sensory-modality independent (Reich et al., 2011a, Reich et al., 2011b, Striem-Amit et al., 2011), or in other words a “task machine”. Thus although the brain regions each show a preference for a specific modality or set thereof, they can still perform their specific task if they receive relevant information, regardless of the sensory input channel through which this information reached it.
Furthermore, as discussed above, recent evidence from the fully congenitally blind has shown that in some cases the same specialization emerges even without any visual experience or memories (Amedi et al., 2007, Collignon et al., 2011b, Fiehler et al., 2009, Mahon et al., 2009, Matteau et al., 2010, Ptito et al., 2009, Reich et al., 2011b, Striem-Amit et al., 2012a, Striem-Amit et al., 2012b), and can occur rapidly once the brain is trained to interpret the relevant information, suggesting that cortical functional specialization can be attributed at least partially to innately determined constraints (Striem-Amit et al., 2012b). Support for the task-machine brain hypothesis also comes from findings on the auditory cortex in deaf animals (Lomber et al., 2010), as well as from at least two anecdotal single case studies testing causality by disrupting the activity in these regions in the blind (Hamilton et al., 2000, Merabet et al., 2009). For reviews on this topic see (Bavelier and Hirshorn, 2010, Dormal and Collignon, 2011, Reich et al., 2011a).
This view of brain organization is consistent with several similar theories, such as the Metamodal (Pascual-Leone and Hamilton, 2001) and Supramodal (Kupers et al., 2011, Pietrini et al., 2004, Ricciardi and Pietrini, 2011) theories of brain organization. Although the specific differences in definition between these theories are beyond the scope of this review, all three offer a similar positive potential for visual rehabilitation.
These findings suggest that while indeed showing some preference for information from a specific sense, most higher order visual areas might be task-based and not sensory-based. However, it is worthwhile noting that there is still considerable controversy over whether early retinotopic areas such as V1, which are linked directly to the sensory organs of one modality, also behave as task machines or whether this is limited to regions more distant from direct input modalities. It is clear though that these areas do indeed show crossmodal organization and plasticity, which makes it especially important to continue to test whether this task specific metamodal organization can occur in them after training.
An additional bias inherent to these experiments is that they have mostly been conducted on populations of the congenitally blind. It is thus unclear how these data relate to the late blind. The cause for this may have been the attempt to avoid confounds such as visual imagery, and the difference caused by possible critical periods of development which the late-blind have experienced but the congenitally blind have not. However, while the findings from comparative studies indeed tend to indicate differences in results between the sighted, congenitally blind and late-blind, they all still exhibit the basic existence of task based neural activation (Thaler et al., 2011, Amedi et al., 2007, Renier and De Volder, 2010). On the other hand, several conflicting results suggest that for the late blind the preservation of functional specialization for other senses may only be on a level comparable to the results from sighted subjects (Dormal et al., 2012).
4.2. A neurobiological basis for optimism for visual rehabilitation
If the hypothesis of the highly flexible task-oriented sensory-independent/metamodal/supramodal brain is borne out, the absence of visual experience should not limit the task-specialization of the visual system, despite its recruitment for various functions in the blind, and the visual cortex of the blind may still be able to retain its functional properties using other sensory-modalities. This is very encouraging with regard to the potential for visual rehabilitation, and may form the theoretical basis for the new empirical evidence of success in rehabilitation which will be discussed below.
Note that many of these results were achieved using SSDs, and in particular after training using SSDs, as part of the research results mentioned in Section 3.2. These results also suggest that SSD training might be useful for shaping the brain to interpret input coming from other SSDs and devices, a point we elaborate upon in Sections 5.2 and 5.6.
4.3. The substituted sense shares sensory space
Adding to this optimism, Ward & Wright have recently used an audio–visual mismatch paradigm conveyed by visual information and visual-to-auditory sensory substitution information to show the existence of a shared mental visual workspace between them (Wright et al., 2012). Similarly, it has recently been shown (Levy-Tzedek et al., 2012b), in a rotation-reaching task combining information from vision and from a vision-to-auditory SSD, that information acquired through SSDs can be integrated into the shared multisensory perceptual grasp of our environment. These results hint at a mental transfer suggestive of a shared sensory representation.
4.4. Additional new behavioral results
But if the suggestions of the previous two sections are true, why have the blind been unable to regain visual function using previous types of visual rehabilitation? The surprising answer is that with time, and to a certain very limited extent, they actually were able to regain more visual functions than previously expected. While still very far from “normal” vision, these achievements offer the users practical tools and skills for dealing with a wide variety of otherwise inaccessible visual-based tasks. In this section we discuss this claim in the light of recent results with SSDs, and then in the next section briefly report similar recent evidence using other forms of visual rehabilitation.
A small number of individuals who have used SSDs extensively have been able to acquire some practical every-day life skills and advanced visual functions such as depth perception (Ward and Meijer, 2010). Moreover, recent behavioral results with SSDs on larger numbers of subjects who had less experience with the devices have far surpassed the limits set by previous theories.
In recent experiments the performance of some blind users (Striem-Amit and Amedi, 2012) even exceeded the threshold for the World Health Organization (WHO) definition of blindness on the Snellen acuity test, showing that at least “legally” the subjects are no longer functionally fully blind (see Fig. 4 for more examples), but rather on par with the severely visually impaired. These results are consistent with findings on blindfolded participants with no prior SSD experience (Haigh et al., 2013).
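As a worked illustration of what exceeding the blindness threshold means numerically, the sketch below converts Snellen fractions to decimal acuity and logMAR and compares them with a threshold. The default threshold of 20/400 (equivalent to 3/60) reflects the commonly cited WHO cut-off for blindness and is supplied here as an assumption rather than taken from the text; the function names and the sample acuity values are hypothetical and are not results from the cited studies.

```python
import math


def snellen_to_decimal(numerator, denominator):
    """Decimal acuity for a Snellen fraction, e.g. 20/400 gives 0.05."""
    return numerator / denominator


def snellen_to_logmar(numerator, denominator):
    """logMAR value for a Snellen fraction (higher means worse acuity)."""
    return math.log10(denominator / numerator)


def exceeds_blindness_threshold(numerator, denominator, threshold=(20, 400)):
    """True if the acuity is better than the (assumed) blindness threshold."""
    return (snellen_to_decimal(numerator, denominator)
            > snellen_to_decimal(*threshold))


# Hypothetical acuity scores, for illustration only.
for score in [(20, 1000), (20, 400), (20, 320), (20, 200)]:
    print(score,
          round(snellen_to_decimal(*score), 4),
          round(snellen_to_logmar(*score), 2),
          exceeds_blindness_threshold(*score))
```

Under this assumed cut-off, any score with a Snellen denominator below 400 (for a numerator of 20) would count as better than the blindness threshold while still falling within the range of severe visual impairment.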
Fig. 4. Examples of behavioral tasks performed using SSDs.
Furthermore, it has recently been shown (Striem-Amit et al., 2012a) with a small number of congenitally blind individuals that 70 h of training with the vOICe SSD not only enables the blind to learn to read using the device, but also enables them to discriminate a wide set of categories (letters, textures, faces, houses, objects, body shapes, geometrical shapes) and to perform difficult tasks such as recognizing facial expressions.
Even after only brief training, a small number of congenitally blind individuals have been able to recognize patterns (Poirier et al., 2007), perform motion discrimination and tracking tasks (Chekhchoukh et al., 2011, Ptito et al., 2009), extract depth cues and estimate object distance (Renier and De Volder, 2010), match socks by color (Bologna et al., 2009), and recognize. Blind users have performed basic navigational tasks such as walking down a corridor and opening doors, detecting and avoiding obstacles during effective navigation within a human-sized obstacle course (Chebat et al., 2011), walking along a colored line (Bologna et al., 2009), recognizing different virtual routes (Kupers et al., 2010) and locations, objects and landmarks encountered on the way while navigating real-world streets such as buildings, crosswalks, fences and trees. They have used SSDs for finding an object in a room, differentiating between different types of fruit, and locating light sources (Amedi et al., 2007, Bologna et al., 2010, Capalbo and Glenney, 2009, Durette et al., 2008, Reynolds and Glenney, 2009). The blind users were even able to perform “eye”-hand coordination such as locating targets, pointing to them and reaching for them (Levy-Tzedek et al., 2012a), placing rings on a cone, winning a game of miniature bowling, recognizing an unoccupied chair, etc. (Maidenbaum and Amedi, 2012).
While there is relatively little published research on the late blind, their behavioral achievements using SSDs seem to match those attained by the congenitally blind, and indeed some of the best achievements were made by members of this group (Maidenbaum and Amedi, 2012, Ward and Meijer, 2010).
Nevertheless, these behavioral achievements have mostly been obtained in lab settings, as part of research programs focused on answering specific experimental questions rather than on visual rehabilitation. Thus, they can be seen as a proof-of-concept that far better results could be achieved with a program dedicated to rehabilitation.
It should also be emphasized that these results are of course still far from what can be achieved with full vision, but they are far better than previously expected, and show that SSDs have concrete practical uses.
4.5. Insights from other types of visual restoration
These results are complemented by recent reports from studies using other types of visual restoration. Researchers at the “Prakash” (Sanskrit for “Light”) project in India, which aims to restore vision through conventional treatments such as cataract removal and corneal transplants that are generally unavailable in the developing world, have reported behavioral results which seem far more promising than expected for these patients, including adults. Results immediately after surgery are typical of the 20th century outcomes described in Section 3.3, in which patients gain some basic visual skills but fail to gain others. However, follow-up exams conducted several months later show a significant improvement in many of these skills over time, such as the ability to count 3D objects, integrate line contours, name an object in a noisy image and count overlapping objects. These findings provide a significantly more optimistic outlook (Held et al., 2011, Ostrovsky et al., 2009) and challenge the strict view that critical periods are completely irreversible (Thomas, 2011).
Specifically, in a case-study conducted in 2006, Sinha reported (Ostrovsky et al., 2006) that SRD, a patient who was blind until the age of 12 and whose initial performance seemingly echoed the pessimistic view, had mastered a wide variety of visual tasks 20 years later, enabling her to function independently as a person with low-vision. Thus, the combination of time and training may enable the newly sighted to learn to see beyond the level once thought possible.
Another advancing field with similar results is the use of retinal implants. Patients implanted with such a prosthesis have been able to perform tasks such as reading letters, locating objects, etc. beyond the expected ability from the pessimistic theory, though still at a level far from satisfactory (Dagnelie, 2012), and to date there are very few congenitally blind individuals with such implants.
It should be noted that in both of these cases success was limited and time consuming. While there has been no direct comparison between these subjects and SSD users, an initial comparison of the reported results indicates that SSDs may currently hold the upper hand both in terms of accomplishments and in terms of the time needed to achieve them. As elaborated on in Section 5.6, we believe that a synergistic attempt which would use SSDs with these patients before and after they undergo surgery could increase their success rate significantly.
5. Outlining several future practical steps
There are several additional factors which we believe have the potential to change the current status of SSD usage in clinical settings. These include technological advances that will both make these devices more user friendly and upgrade their capabilities (Section 5.1), advances in training programs (Section 5.2), harnessing the power of the web through the use of virtual environments (Section 5.3), using simplified and safe real-world environments to prepare for real-world experiences (Section 5.4), making better use of the advantages offered by active sensing (Section 5.5) and finally synergistically combining SSDs and other methods of visual rehabilitation (Section 5.6). These factors are summarized in Fig. 5.
Fig. 5. Practical visual rehabilitation using SSDs: from pessimism to potential hope.
5.1. The potential of future devices and technological advances
We believe that the recent achievements detailed in Section 4.4 are only the first step toward fulfilling the potential of these devices, and that current technological advances will play a great part in this process.
Recent technological advances are continuously contributing to the three main modules which comprise an SSD. These include input sensors capturing visual information, a processing unit that extracts data and generates the representation, and an output human machine interface that portrays the data to the blind user. Newly available input sensors offer SSD users new parameters such as depth and color information. For example, depth information can be captured using stereo imaging (Akhter et al., 2011), 3D IR cameras such as Microsoft Kinect (Zöllner et al., 2011) and time-of-flight (TOF) imaging (Callaghan and Mahony, 2010, Zeng, 2012), and color can be captured via standard cameras (Deville et al., 2009, Levy-Tzedek et al., 2012b). In addition, as a general trend, higher quality cameras are becoming cheaper, smaller and more widely available due to their integration into smartphones (Akhter et al., 2011, Blum et al., 2012), including depth cameras (Firth, 2013). Initial results with color-capable SSDs such as EyeMusic (Levy-Tzedek et al., 2012a, Levy-Tzedek et al., 2012b) or See ColOr (Bologna et al., 2009) show that users can now perform tasks ranging from simple color discrimination to tasks which require more complex usage of the color information, such as reaching for the correct object, following the right line or even matching socks.
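The three-module structure described above (input sensor, processing unit, output interface) can be illustrated with a minimal sketch. This is an illustration under stated assumptions, not the architecture of any particular SSD: the names `fake_camera`, `threshold_processor`, `text_output` and `run_ssd` are hypothetical, the "camera" returns a fixed toy frame, and the processing step is a simple brightness threshold chosen only for brevity.

```python
from typing import Callable, List

# A frame is a grayscale image: a list of rows of brightness values in [0, 1].
Frame = List[List[float]]


def fake_camera() -> Frame:
    """Input sensor stand-in (hypothetical): returns one fixed 2x3 frame."""
    return [[0.1, 0.9, 0.2],
            [0.8, 0.7, 0.0]]


def threshold_processor(frame: Frame, cutoff: float = 0.5) -> Frame:
    """Processing unit stand-in: keep only pixels at or above the cutoff."""
    return [[1.0 if px >= cutoff else 0.0 for px in row] for row in frame]


def text_output(frame: Frame) -> str:
    """Output interface stand-in: renders the frame as text.

    A real device would drive headphones or tactile actuators here.
    """
    return "\n".join(
        "".join("#" if px > 0 else "." for px in row) for row in frame
    )


def run_ssd(sensor: Callable[[], Frame],
            processor: Callable[[Frame], Frame],
            output: Callable[[Frame], str]) -> str:
    """Chain the three modules: capture, process, present."""
    return output(processor(sensor()))


rendered = run_ssd(fake_camera, threshold_processor, text_output)
print(rendered)
```

Keeping the three modules behind separate functions means that, in such a design, a new sensor (for example a depth camera) or a new output interface could be swapped in without touching the other two.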
On the processing unit front, two main trends are worth mentioning. The software trend includes the integration of computer vision algorithms (HVSS; Tan et al., 2010, Zeng, 2012) and machine learning techniques (Zeng, 2012) to filter the captured signals, which helps reduce the mental load, and enables the extraction of important objects and signs (Lescal et al., 2013, Tanveer et al., 2012, Zeng, 2012) as well as their simplification for user convenience. This software can be adapted from external packages, by taking advantage of research in other fields such as image processing and computer vision, especially when given additional visual information about the scene such as depth maps. In addition, there are new attempts to better refine the transformation algorithms based on recent literature in psychophysics to increase intuitiveness (Stiles and Shimojo, 2013) and computational tools such as genetic algorithms (Wright and Ward, 2013).
The hardware trend is mainly affected by the flourishing process of component miniaturization and the fact that smartphones offering a simple programming API are starting to be very widespread. This provides developers with a powerful yet small and elegant mobile computing platform that can be adapted to veteran SSDs by combining them with smartphone implementations (the vOICe, the EyeMusic) or new SSDs that have smartphones as their main platform (the HVSS). Moreover, improvements in user accessibility (such as native speech and hand gesture input APIs) make these devices easier to operate by blind users without any sighted aid. New platforms such as Google Glass, which consists of a glasses-mounted augmented reality platform, and Or-Cam, which offers augmented reality information to the visually impaired, could also serve as a dedicated unobtrusive platform for future SSD development.
The output human machine interface is the module that is currently the furthest from meeting expectations, especially for tactile SSDs. That said, better haptic feedback is being produced in many projects (among many others see Zeng, 2012, Schorr et al., 2013, Visell, 2009). In one interesting project, electrical actuators, similar to those used in the expensive TDU for tongue stimulation, have been recreated into a low resolution open-source SSD, thus increasing their accessibility (Dublon and Paradiso, 2012). Two interesting projects currently under development which hold potential for future interfaces are the Senseg (2012) and the Disney TeslaTouch (Bau et al., 2010), which attempt to augment touch screens by enabling the user to haptically sense the visual information displayed on them, which increases accessibility to parameters such as contour, texture and basic 3D information.
In the auditory domain, veteran SSDs use headphones which have become cheaper, less obtrusive and of better quality. It is worthwhile mentioning on this front the availability of bone-conduction headphones which keep the ears free to receive other auditory input. There has been greater use of stereo channels to map objects on each side of the image (Tan et al., 2010) and directional sound for better x-axis mapping, and in some cases even full location information (Bujacz et al., 2012). The main hurdle facing auditory SSDs is the auditory pleasantness of the cues, which, while improving, is still far from satisfactory, and remains a major reason why users do not adopt these devices in everyday life (Wright and Ward, 2013).
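The idea of using stereo channels for x-axis mapping can be sketched as follows. This is a minimal sketch and not the mapping used by any of the devices cited above: it assumes a constant-power pan law, a common audio technique, and the function name `pan_gains` and the normalized coordinate `x` are hypothetical.

```python
import math
from typing import Tuple


def pan_gains(x: float) -> Tuple[float, float]:
    """Return (left_gain, right_gain) for a horizontal position x in [0, 1].

    x = 0 is the left edge of the image and x = 1 is the right edge.
    Constant-power panning keeps left**2 + right**2 equal to 1, so the
    overall loudness stays roughly constant as an object moves sideways.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must be in [0, 1]")
    angle = x * math.pi / 2
    return math.cos(angle), math.sin(angle)


# An object in the left quarter of the image is louder in the left ear.
left, right = pan_gains(0.25)
print(f"left={left:.3f} right={right:.3f}")
```

An object at the center of the image (x = 0.5) receives equal gain in both ears under this assumption.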
Moreover, some SSDs are opting for a multisensory approach where the visual data is portrayed in both the auditory and tactile modalities in order to create a redundant representation which could potentially lead to higher performance (Stein and Meredith, 1993, Zeng, 2012) or provide the most appropriate information to the optimal channel by scenario recognition.
As new technologies emerge and technologies in the prototype phases mature, we believe that SSDs will have better visual processing algorithms and output interfaces, smaller apparatuses and lower prices, thus removing many of the obstacles to widespread use that these devices have faced. However, we believe that without a concentrated effort to take these developments from the lab to the clinic, and without proper training and coordination with a large community of blind individuals, all these advances will remain more of a curiosity that attracts media exposure but remains in use mainly in labs or by a very small number of people, as has been the fate of many SSDs developed over the last ten years.
5.2. The importance of training
The profound importance of training is one of the clearest emerging and relatively new insights from working with patients using all types of visual rehabilitation including SSDs (Reynolds and Glenney, 2012, Striem-Amit et al., 2012b, Ward and Meijer, 2010), retinal implants (Dagnelie, 2012) and sight restoring surgery (Ostrovsky et al., 2006, Putzar et al., 2012, Xu et al., 2012).
Seeing is not as intuitive as is customarily believed, and learning to process unconventional visual information can be long and difficult. The lack of a basic understanding of visual principles that the sighted take for granted is obvious after minutes of talking to any blind person. This is especially true for the congenitally and very-early blind, who simply do not have these concepts, but it also holds for the late blind trying to cope with atypical and degraded input such as that arriving from a prosthesis or an SSD. Concepts that seem foreign to them, such as the fact that shading changes an image, need to be learned, since shadows do not have an intuitive correlate in other modalities. The concept of visual perspective makes no sense to someone who is used to relying on auditory representations. Even the concept of size, and especially the fact that perceived size changes with distance, is odd despite the auditory corollary of signal strength. Transparent objects like windows or light bulbs baffle our subjects, and the concept of mirrors evokes fear, especially the idea of seeing themselves (Maidenbaum and Amedi, 2012). For the congenitally blind, some visual percepts such as color, which have no other sensory correlate, require special focus when taught. The late blind, on the other hand, may have a better idea of what the visual information they are receiving is supposed to look like, but the transformation can be difficult, especially for visual rules they used subconsciously while still sighted. All of these rules and concepts need to be taught and explained.
Thus subjects who have received serious longitudinal training not only learn to use the devices far better, but are able to better process the visual information and learn visual principles. This live guidance is also important on an emotional level when dealing with concepts such as how others see you, and on a motivational level, especially when at first the stimuli are confusing and their interpretation is tiring.
There are several reports in the literature documenting subjects who struggled to learn to see and received patient, long-term support from people around them, such as SRD (Ostrovsky et al., 2006) and PF (Ward and Meijer, 2010). When working with our subjects, these improvements and the progress from one training session to the next are plain to see. Furthermore, when the training is structured and focused, our subjects learn better and faster. Thus we believe that the creation and dissemination of such longitudinal training programs could significantly enhance the potential outcomes of visual rehabilitation.
To be maximally effective, we suggest that this training should include several rigorous first steps in which the basic functions of the device are learned in a supervised fashion, followed by a series of more dynamic lessons utilizing tools such as dedicated virtual environments, games, active sensing and real environment training (see below). These lessons should incorporate a gradual increase in level of difficulty from static stimuli in the first hours to dynamic virtual games, and finally to basic real world tasks. They should also include dedicated modules for important object categories such as faces and home objects, and specific real-life tasks such as navigation and object location. Some of these modules should come early in the training process, to give users a feeling of accomplishment and to show them how they can acquire practical skills with the device. While there have already been some attempts at long-term training programs (Striem-Amit et al., 2012b), we believe that such a program will require a long period of refinement, similar to the evolution of training programs for the traditional white-cane during the twentieth century (Welsh, 1981).
5.3. The importance of online and virtual training
One of the key problems with the above-mentioned training programs is their cost and the effort involved in them; e.g., the personnel needed for such training efforts. We suggest that eventually current Orientation & Mobility resources could be turned to this goal but even they are currently insufficient, especially in the developing world in which most of the blind reside.
A possible solution could be to incorporate online and virtual training into the training protocols, as is currently being developed for devices such as the white cane (Lahav, 2012) and novel devices such as the EyeCane (Maidenbaum et al., 2012, Maidenbaum et al., 2013). In a recent study, a virtual environment was shown to be as efficient in teaching the spatial layout of a building as more classic sessions with instructors (Merabet et al., 2012). The blind and visually impaired could connect to this online training from the comfort and safety of their own homes, using cheap smartphones or cheap computers. This will help them gain experience both in using the SSDs in various scenarios, and in practicing visual principles in dedicated lessons which would also be applicable to the newly visually restored.
Another important aspect of virtual training is the ability to add gamification elements, which can both increase user motivation and provide implicit feedback to make sure the user is using the SSD correctly. For instance, when presented with three objects, instead of describing the stimuli to an instructor, which is manpower-intensive and requires explicit explanations, or answering a long series of questions about the stimuli, the user could play a game in which they must bring object 1 to object 2 while avoiding object 3. Such a task requires perception and recognition of all three objects and their spatial properties, and thus achieves the same result implicitly.
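The implicit feedback in such a game could be computed automatically from the user's movements. The following is a minimal sketch under stated assumptions, not a description of an existing training game: the function `score_trial`, the 2D coordinates, the `radius` parameter and the outcome labels are all hypothetical, and only the sampled positions along the path are checked, not the segments between them.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def score_trial(path: List[Point], target: Point, obstacle: Point,
                radius: float = 1.0) -> str:
    """Score one trial of 'bring object 1 to object 2, avoid object 3'.

    path     -- sampled positions of the carried object (object 1)
    target   -- position of object 2
    obstacle -- position of object 3
    radius   -- distance within which a position counts as touching
    """
    if not path:
        raise ValueError("path must contain at least one position")
    # Touching the obstacle at any sampled position fails the trial.
    if any(math.dist(p, obstacle) <= radius for p in path):
        return "hit_obstacle"
    # Otherwise the trial succeeds only if the path ends at the target.
    if math.dist(path[-1], target) <= radius:
        return "success"
    return "missed_target"


# A detour around the obstacle that ends on the target.
result = score_trial([(0, 0), (0, 3), (4, 3)], target=(4, 3), obstacle=(2, 0))
print(result)
```

Under these assumptions no instructor needs to question the user: the outcome label alone indicates whether all three objects were perceived and located correctly.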
5.4. Taking the step to simplified environments and consequently the real-world
Another important step involves bridging the gap between using SSDs in controlled conditions and in real world settings which are far noisier and more difficult to interpret. This level of noise, caused by the many other objects in a visually rich real environment, is highly confusing, and can produce sensory overload in the user. We suggest this gap can be bridged by a combination of several different tools. These include (1) dedicated real-world “safe environments” for training, similar to those currently used in classic orientation and mobility training. In these environments SSD users will still experience the sensory overload, but because they are safer, they can learn to deal with the excess information and focus their attention on relevant details, similar to the way children learn to filter their senses when growing up. (2) Since virtual environments are easier and cheaper to create, dedicated virtual environments can augment the real ones described above. These environments can be designed to increase gradually in difficulty. This may lead to a better learning curve, simulate a wider variety of environments, and let users practice from their own homes. (3) Input-simplification algorithms that will filter out a large part of the noise using techniques such as cartoonization or a depth camera to filter out input beyond a certain range.
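The third tool, filtering out input beyond a certain range with a depth camera, can be sketched as follows. This is a minimal sketch assuming a grayscale image and a per-pixel depth map of the same size; the function name `filter_by_range`, the use of metres, the `None` convention for missing depth readings and the default range of 2 m are all hypothetical choices made for illustration.

```python
from typing import List, Optional

Image = List[List[float]]                # brightness values in [0, 1]
DepthMap = List[List[Optional[float]]]   # distance in metres, None = no reading


def filter_by_range(gray: Image, depth: DepthMap,
                    max_range_m: float = 2.0) -> Image:
    """Blank out every pixel that is farther than max_range_m.

    Pixels without a depth reading are also blanked, so that only
    objects known to be nearby reach the SSD's output stage.
    """
    if len(gray) != len(depth) or any(
        len(g_row) != len(d_row) for g_row, d_row in zip(gray, depth)
    ):
        raise ValueError("gray and depth must have the same dimensions")
    return [
        [g if (d is not None and d <= max_range_m) else 0.0
         for g, d in zip(g_row, d_row)]
        for g_row, d_row in zip(gray, depth)
    ]


gray = [[0.5, 0.9],
        [0.3, 0.7]]
depth = [[1.0, 3.5],
         [None, 2.0]]
near_only = filter_by_range(gray, depth)
print(near_only)
```

In training, the range could be widened gradually as the user becomes more comfortable with richer input, although that schedule is itself an assumption rather than something tested here.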
5.5. The importance of active sensing
As mentioned above, most tasks performed with SSDs have been highly controlled laboratory trials. While such trials enable clearer results, they are often static, and participants thus lose the advantages of closing the sensory-motor loop (Held, 1965). Using SSDs with an active sensory-motor loop should significantly improve behavioral results. For example, Ahissar et al. have shown that human subjects learning to use “whiskers” enter an iterative motor-sensory process that gradually converges on stable percepts (Horev et al., 2011, Mitchinson et al., 2007). Subjects who used the EyeMusic dynamically were able to accurately reach for an object perceived via the SSD, just as they were able to reach for it visually (Levy-Tzedek et al., 2012a). This sensory-motor information was even shared between the visual and the substituted modalities (Levy-Tzedek et al., 2012b). A more active sensory-motor paradigm was also shown to boost SSD training (Reynolds and Glenney, 2012).
5.6. Augmenting retinal prostheses and residual vision – a combined vision-rehabilitation device (VRD)
The recent advances in other approaches to visual restoration provide interesting opportunities. We recently suggested (Reich et al., 2011a) a post-operation system combining an SSD with retinal prostheses (“Vision rehabilitation device”; VRD, illustrated in Fig. 6). This system will include a camera continuously capturing images of the surroundings and a processing unit. This unit converts the visual information into (i) an auditory SSD representation and (ii) a neural stimulation conveyed by the prostheses’ electrodes. Information about the surroundings would thus be received in parallel from the prostheses as well as from the SSD. In such a device, the SSD would serve as a “sensory interpreter” providing explanatory input for the visual signal arriving from the invasive device. This dual synchronous information is expected to significantly increase the rate of rehabilitation. At a later stage the SSD can be used to provide input beyond the maximal capabilities of the prostheses, or capabilities they do not possess, such as adding color information via EyeMusic (Levy-Tzedek et al., 2012a). This concept is also applicable to cases of late sight-restoration such as the “Prakash” project participants (Ostrovsky et al., 2009).
Fig. 6. SSDs and VRD – Left: the three different layers of SSDs: an input sensory device, a processing unit and an output human machine interface. Right: a sample VRD combining three output modalities conveying the captured visual information using skin stimulation, auditory output and a retinal prosthesis. Copyright © 2013 Second Sight Medical Products, Inc.
This concept could be further expanded by augmenting the residual vision of low vision users, who will use their residual vision and not the prosthesis, with the visual parameters provided by the SSD. This possibility is supported by the subjective descriptions of CC, a woman who has residual eyesight and uses a visual-to-auditory SSD on a daily basis. She reports that her SSD perception and her visual perception share a single space, but that the SSD provides greater detail such that a more complex perception of the scenery is generated (Ward and Meijer, 2010). The findings mentioned in Section 4.3 that the substituted sense has a shared space with other senses also make it possible to use the location information in motion even without awareness (Levy-Tzedek et al., 2012b, Wright et al., 2012). Tactile-Sight, a recent example of such a device for low-vision users, augments low vision with depth information (Cancar et al., 2013).
In a more futuristic view, the integration of special input sensors to the VRD may further expand visual augmentation, as demonstrated in a recent interesting study by Alsberg (2012) who attached a hyperspectral camera to the vOICe SSD, creating an SSD for sonifying chemical compounds, which in turn enabled him to use audition to distinguish identical looking chemical substances (sucrose vs. potato powder).
This approach, however, has its own drawbacks. For example, such information might cause sensory overload, competition between the senses, or even simply tire users faster. Nevertheless, we suggest that the advantages should outweigh these potential disadvantages, especially when balanced correctly in a structured training program.
6. Conclusion and future directions
We have presented a variety of reasons to believe SSDs now have the potential to be re-applied successfully for visual rehabilitation based on recent theoretical advances and current optimistic behavioral results. We believe that the steps toward this important goal must include the creation of structured training programs, dedicated virtual and real world training environments, the development of the next generation of SSDs using state-of-the-art technology, and most importantly, the promotion of scientific efforts directly aimed at optimizing the visual rehabilitation process using SSDs, which has taken the back seat in the last few decades in favor of their use for research.
It should be made clear that the potential beneficiaries are not yet fully able to use these devices in the real world and that the devices themselves are not yet fully ripe for full use. Rather, we wish to highlight the growing possibility for such use in the near future, both as a stand-alone approach and, primarily, in combination with other approaches such as retinal implants (VRD) or computer vision algorithms. We believe that these devices could be exploited for much more practical use if efforts are made in this direction, and we have outlined several important factors we believe need to be further developed as the next steps toward this goal.
Acknowledgments
This work was supported by a European Research Council grant to A.A. (grant number 310809), The Charitable Gatsby Foundation, The James S. McDonnell Foundation scholar award (to AA; grant number 220020284), The Israel Science Foundation (grant number ISF 1684/08), The Edmond and Lily Safra Center for Brain Sciences (ELSC) Vision center grant (to AA); SA was supported by a scholarship from the Israel Ministry of Science.
References
A. Ahuja, J. Dorn, A. Caspi, M. McMahon, G. Dagnelie, L. Dacruz, P. Stanga, M. Humayun, R. Greenberg. Blind subjects implanted with the Argus II retinal prosthesis are able to improve performance in a spatial-motor task. British Journal of Ophthalmology, 95 (2011), p. 539
S. Akhter, J. Mirsalahuddin, F. Marquina, S. Islam, S. Sareen. A Smartphone-based haptic vision substitution system for the blind. IEEE 2011 – 37th Annual Northeast Bioengineering Conference (NEBEC) (2011), pp. 1-2
B.K. Alsberg. Is sensing spatially distributed chemical information using sensory substitution with hyperspectral imaging possible? Chemometrics and Intelligent Laboratory Systems, 114 (2012), pp. 24-29
A. Amedi, A. Floel, S. Knecht, E. Zohary, L.G. Cohen. Transcranial magnetic stimulation of the occipital pole interferes with verbal processing in blind subjects. Nature Neuroscience, 7 (11) (2004), pp. 1266-1270
A. Amedi, G. Jacobson, T. Hendler, R. Malach, E. Zohary. Convergence of visual and tactile shape processing in the human lateral occipital complex. Cerebral Cortex, 12 (2002), pp. 1202-1212
A. Amedi, R. Malach, T. Hendler, S. Peled, E. Zohary. Visuo-haptic object-related activation in the ventral visual pathway. Nature Neuroscience, 4 (2001), pp. 324-330
A. Amedi, N. Raz, H. Azulay, R. Malach, E. Zohary. Cortical activity during tactile exploration of objects in blind and sighted humans. Restorative Neurology and Neuroscience, 28 (2010), pp. 143-156
A. Amedi, N. Raz, P. Pianka, R. Malach, E. Zohary. Early ‘visual’ cortex activation correlates with superior verbal memory performance in the blind. Nature Neuroscience, 6 (7) (2003), pp. 758-766
A. Amedi, W.M. Stern, J.A. Camprodon, F. Bermpohl, L. Merabet, S. Rotman, C. Hemond, P. Meijer, A. Pascual-Leone. Shape conveyed by visual-to-auditory sensory substitution activates the lateral occipital complex. Nature Neuroscience, 10 (2007), pp. 687-689
M. Auvray, S. Hanneton, J.K. O'Regan. Learning to perceive with a visuo-auditory substitution system: localisation and object recognition with The vOICe. Perception – London, 36 (2007), p. 416
P. Bach-y-Rita. Brain Mechanisms in Sensory Substitution. Academic Press (1972)
O. Bau, I. Poupyrev, A. Israr, C. Harrison. TeslaTouch: electrovibration for touch surfaces. Proceedings of the 23rd Annual ACM Symposium on User Interface Software and Technology, ACM, New York, NY, USA (2010), pp. 283-292
D. Bavelier, E.A. Hirshorn. I see where you’re hearing: how cross-modal plasticity may exploit homologous brain structures. Nature Neuroscience, 13 (2010), p. 1309
M. Bedny, A. Pascual-Leone, D. Dodell-Feder, E. Fedorenko, R. Saxe. Language processing in the occipital cortex of congenitally blind adults. Proceedings of the National Academy of Sciences USA, 108 (2011), pp. 4429-4434
J.R. Blum, M. Bouchard, J.R. Cooperstock. What's around me? Spatialized audio augmented reality for blind users with a Smartphone. Mobile and Ubiquitous Systems: Computing, Networking, and Services (2012), pp. 49-62
G. Bologna, B. Deville, T. Pun. Sonification of color and depth in a mobility aid for blind people. Proceedings of the 16th International Conference on Auditory Display (ICAD2010), Washington, DC, USA (2010)
G. Bologna, B. Deville, T. Pun. Blind navigation along a sinuous path by means of the See ColOr interface. Bioinspired Applications in Artificial and Natural Computation, Springer, Berlin Heidelberg (2009), pp. 235-243
M. Bujacz, P. Skulimowski, P. Strumillo. Naviton – a prototype mobility aid for auditory presentation of three-dimensional scenes to the visually impaired. Journal of the Audio Engineering Society, 60 (2012), pp. 696-708
V. Busskamp, J. Duebel, D. Balya, M. Fradot, T.J. Viney, S. Siegert, A.C. Groner, E. Cabuy, V. Forster, M. Seeliger. Genetic reactivation of cone photoreceptors restores visual responses in retinitis pigmentosa