195_Cluster_Install.txt - iEngineer Documentation

File:	195_Cluster_Install.txt
Last Modified:	2001/09/07 15:38:04
Author:	mtripets
Document Title:	Oracle Installation
195.1	General --
195.2	Contacts --
195.3	Support --
195.4	Licences --
195.5	Troubleshooting --
195.6	Installation --
195.7	OS Install
195.8	OS Modifications
195.9	Cluster Software Install
195.10	Reboot all nodes
195.11	Configure SCI Devices
195.12	Install ORCLudlm patch
195.13	Reboot all nodes
195.14	Install and configure Veritas packages and licenses on all nodes
195.15	Reboot all the nodes
195.16	Start primary cluster node
195.17	Create oracle disk groups and raw volumes for OPS
195.18	Restart cluster on all nodes
195.19	Install Oracle with the OPS option on all nodes
195.20	The following steps are for installing the Cluster software on savage and selway.
195.21	Starting and stopping the cluster
195.22	Cluster Troubleshooting
195.23	SCI interface troubleshooting

195.1   General --

        NOTE: After install, **NEVER** restart a machine without stopping the cluster node
              running on it.  The system will go down in an unstable state and require a
              good fscking before it is happy again.

        NOTE: During the install, do not comment out or otherwise secure any of the usual
              suspect services such as rsh as they are required for the OPS install.

        Any deviations from the directions provided by Sun/Oracle are presented here.

195.2   Contacts --

195.3   Support --
        
195.4   Licences --
        
        Sun Cluster does not expressly require licence information to install.  Details about
        Veritas licence information can be found in section 195.14 .

195.5  Troubleshooting --

        See section starting at 195.21

195.6   Installation --

        Overview:

        195.7)  Install Solaris 8 and patch
        195.8)  Modify system files and users per instructions below
        195.9)  Use scinstall to install Sun Cluster 2.2.  Specify VxVM/Cluster Volume Manager
            and OPS as features.
        195.10)  Reboot all nodes
        195.11)  Configure SCI devices
        195.12)  Install ORCLudlm patch
        195.13)  Reboot all nodes
        195.14)  Install Veritas packages and licenses on all nodes.
            Run vxinstall on all nodes for boot disk encapsulation
        195.15)  Reboot all nodes
        195.16) Start primary cluster node
        195.17) Create oracle disk groups and raw volumes for OPS database
        195.18) Restart cluster on all nodes
        195.19) Install Oracle with the OPS option on all nodes

195.7   OS Install

        Initial software install from 'boot cdrom'
        Additional selections as follows (in order of appearance):

        Set IP number from DNS 
        No IPv6
        No kerberos
        No name resolution (unless at colo)
        Select North America/U.S.A. for locale
        NO 64 bit support
        Entire Dist + OEM Support
        
        For boot disk selection, look over the selection of presented disks.
        On each of the non-SCSI controllers there will be a large number of 
        disks offered - do not select any of these.  You should see something
        like the following to identify the SCSI disks:
        
         x      [ ] c1t26d0  (17269 MB)              17269 MB
         x      [X] c2t0d0   (17269 MB) boot disk    17269 MB
         x      [ ] c2t1d0   (17269 MB)              17269 MB
         x      [ ] c4t0d0   (17269 MB)              17269 MB

        There will be one controller with two disks attached.  In this case it is the c2 device.
        Select the first disk on the controller.  If you do not know which one to select, stop 
        now and find someone who knows.
        
        For disk geometry, create a 2 GB root, and a 2 GB /tmp partition.  Leave the
        remainder of the disk space unused.  The displayed space will show 2001 MB 
        from a rounding error.
        
        
          Customize Disk: c2t0d0 -------------------------------------------------------
          Boot Device: c2t0d0s0

          Entry:                            Recommended:      MB     Minimum:      MB
        ================================================================================
          Slice  Mount Point                 Size (MB)
             0   /                                2001
             1   /tmp                             2001
             2   overlap                         17269
             3                                       0
             4                                       0
             5                                       0
             6                                       0
             7                                       0
        ================================================================================
                                 Capacity:       17269 MB
                                Allocated:        4002 MB
                           Rounding Error:           1 MB
                                     Free:       13266 MB

        NOTE: You will need to put in the second cdrom after the initial reboot to complete
        the OS install.



        After the remaining questions, the first cd will be processed and the machine will reboot.
        When asked, choose a 'root' password.  Wait a moment for the install process to continue.
        
        Choose the media (Option 1, CD) from which you will install Solaris 8.
        
        1. CD
        2. Network File System
        3. Skip
        
         Media [1]: 1
         
         You will be prompted to insert the CD for Solaris 8 (SPARC) Software 2.
         
         After you insert the CD, please press Enter.
         
         
        Enter the number corresponding to the desired selection for more
        information, or enter 2 to continue [2]: 2
        
        End of Solaris 8 Software 2 installation.

        <Press ENTER to continue> 
        
        <Press Return to reboot the system> 
        
        Once the machine has rebooted, install the Jumbo Patch.  This can be found on host cluster.

195.8          OS Modifications
        
        After the installation, edit the following files:
        
        a) /etc/hosts
        
                # Internet host table
                #
                127.0.0.1       localhost
                192.168.10.32   volga volga.iengineer.com     loghost
                192.168.10.22   neva
                192.168.10.1    gateway
                192.168.10.25   cluster
                
        b) /.profile
        
        PATH=$PATH:/usr/local/bin:/usr/ccs/bin:/opt/SUNWcluster/bin:/opt/SUNWpnm/bin
        PATH=$PATH:/opt/VRTSvmsa/bin:/etc/vx/bin:/opt/SUNWsci/bin:/opt/SUNWscid/bin:/opt/SUNWsma/bin
        export PATH
        MANPATH=$MANPATH:/opt/SUNWcluster/man:/opt/VRTSvmman/man:/opt/SUNWsma/man
        export MANPATH
        stty erase ^H
        TERM=vt100; export TERM
        
        c) /etc/hosts.equiv
        
                neva
                volga
                
        d) /etc/defaultrouter
        
                gateway
                
                
        e) /.rhosts 
        
                 # node 0
                204.152.65.33
                204.152.65.1
                204.152.65.17
                 # node 1
                204.152.65.34
                204.152.65.2
                204.152.65.18
                
                192.168.10.22
                192.168.10.32

        
        f) /etc/system

                PLEASE COPY /etc/system to /etc/system.bak before editing the
                file.  If there is a problem booting, use the "boot -a" option
                at the ok prompt and follow the directions.
        
        
                        * Begin Oracle info *
                        set shmsys:shminfo_shmmax=4294967295
                        set shmsys:shminfo_shmmin=1
                        set shmsys:shminfo_shmmni=100
                        set shmsys:shminfo_shmseg=10
                        set semsys:seminfo_semmni=100
                        set semsys:seminfo_semmsl=300
                        set semsys:seminfo_semmns=600
                        set semsys:seminfo_semopm=100
                        set semsys:seminfo_semvmx=32767
                        * End Oracle info *

        g) /etc/group

                Add the following groups to /etc/group

                oinstall::830:
                dba::860:oracle

        h) Add the oracle user, and place it into the oinstall and dba groups.
           oinstall (GID 830) must be the primary group and dba (GID 860) the secondary.
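
        The edits to /etc/system and /etc/group are easy to get subtly wrong, so a
        quick check before the first reboot can save a "boot -a" session.  The
        following is only a sketch: the function names are made up for this note, and
        each function takes the file to inspect as an argument so it can be tried on a
        copy first.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, not part of any Sun tool.

# The kernel parameters listed for /etc/system in this section.
ORACLE_PARAMS="shmsys:shminfo_shmmax=4294967295
shmsys:shminfo_shmmin=1
shmsys:shminfo_shmmni=100
shmsys:shminfo_shmseg=10
semsys:seminfo_semmni=100
semsys:seminfo_semmsl=300
semsys:seminfo_semmns=600
semsys:seminfo_semopm=100
semsys:seminfo_semvmx=32767"

# check_system_params FILE
# Prints every expected "set" line that is missing from FILE.
# Returns 0 only if all nine parameters are present.
check_system_params() {
    file=$1
    missing=0
    for param in $ORACLE_PARAMS; do
        if grep "^ *set $param *\$" "$file" >/dev/null 2>&1; then
            :
        else
            echo "missing in $file: set $param"
            missing=1
        fi
    done
    return $missing
}

# check_oracle_groups FILE
# Confirms that oinstall (gid 830) and dba (gid 860, listing oracle as a
# member) exist in a group file such as /etc/group.
check_oracle_groups() {
    file=$1
    grep '^oinstall:[^:]*:830:' "$file" >/dev/null 2>&1 || {
        echo "oinstall (gid 830) not found in $file"; return 1; }
    grep '^dba:[^:]*:860:.*oracle' "$file" >/dev/null 2>&1 || {
        echo "dba (gid 860) with member oracle not found in $file"; return 1; }
    return 0
}
```

        Typical use on a node would be "check_system_params /etc/system" followed by
        "check_oracle_groups /etc/group"; both are silent when everything is in place.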


195.9          Cluster Software Install
        
        To install the cluster software, use the Sun Cluster System Cluster Control Panel GUI
        located in /opt/SUNWcluster/bin.  This tool will allow you to connect to both machines 
        in parallel.  

        Before you do this, the cdrom on the primary must be shared with the secondary
        in such a way that /cdrom/direct will produce the same output on both nodes.  The
        following process is known to work:
        
                1. Edit /etc/dfs/dfstab file
                
                        share   /cdrom/multi_suncluster_sc_2_2         

                2. Restart the NFS Server (/etc/rc3.d/S15nfs.server)
                
                3. Type shareall after restarting the nfs.server

                4. Create a directory on the secondary node mimicking the primary cdrom directory
        
                        /cdrom/multi_suncluster_sc_2_2

                5. Mount on secondary node

                        mount neva:/cdrom/multi_suncluster_sc_2_2 /cdrom/multi_suncluster_sc_2_2
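
        The five steps above can be sketched as two small functions, one per node.
        This is an illustration rather than a tested procedure: the function names
        are invented here, the rc script is assumed to accept the usual stop/start
        arguments, and setting DRY_RUN=1 prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the cdrom sharing steps. CD_PATH and PRIMARY are the values
# used in this section; the function names are illustrative.

CD_PATH=/cdrom/multi_suncluster_sc_2_2
PRIMARY=neva

# run CMD: print CMD when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$1"
    else
        sh -c "$1"
    fi
}

# Run as root on the primary node: add the share line, restart the NFS
# server (assumed to take stop/start), then share everything in dfstab.
share_cdrom_on_primary() {
    run "echo 'share $CD_PATH' >> /etc/dfs/dfstab" &&
    run "/etc/rc3.d/S15nfs.server stop" &&
    run "/etc/rc3.d/S15nfs.server start" &&
    run "shareall"
}

# Run as root on the secondary node: create the same path and mount the
# primary's cdrom on it.
mount_cdrom_on_secondary() {
    run "mkdir -p $CD_PATH" &&
    run "mount $PRIMARY:$CD_PATH $CD_PATH"
}
```

        With DRY_RUN=1 set, calling either function just lists what it would do, which
        is a reasonable way to review the commands before running them for real.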


        Now that the cdrom is shared, start the CCP tool as follows:

                ./ccp prodwc-db-cluster
        
        NOTE: the prodwc-db-cluster has been pre-defined on the cluster console and will allow
        you to connect to the console servers.  Do not enter any other value here.  There may be 
        a short delay while the CCP connects.

        From now on, the directions located at:

        http://docs.sun.com:80/ab2/coll.650.1/CLUSTINSTALL/@Ab2PageView/3864?Ab2Lang=C&Ab2Enc=iso-8859-1

        will be used, starting at the *Server Software* install section located approx. 25% along
        in the doc.  The install directions will be considered verbatim except where noted.  Numbers
        in the following notes reference the original document.

        1)..... Use ./ccp prodwc-db-cluster
        2)..... Set TERM=vt100; export TERM here
        3-8)... *SKIP THESE* as they have already been done in the 195.6.2 install
        9)..... no change
        10).... no change
        11).... Be sure to use automatic
        12).... Select CVM
        13).... Cluster name is prodwc-db-cluster
        14).... Potential nodes = 2, initially configured nodes = 2
        15).... SCI interface; hostname node 0 = neva, hostname node 1 = volga
                No ethernet information required
        16).... HA data services = yes, logical hosts = no
        17-27). SKIP These questions will not be offered
        28).... Select the quorum device as the first disk on the shared arrays.  In this
                case c0t0d0s2 should be the logical choice.  This is most important, so if
                there are questions, find someone who can answer them before continuing
        29).... unused
        30).... Select OPS (#10) for data services, then 12 to quit.  The documented list has
                one more entry than the software will offer
        31).... quit install
        32-33). SKIP 


195.10       Reboot all nodes

195.11       Configure SCI Devices
        
        Exhaustive documentation on the SCI cards can be found at:

        http://docs.sun.com:80/ab2/@LegacyPageView?toc=SUNWab_83_4:/safedir/space3/coll2/SUNWabha/toc/ETPCLINSTALL:Page_B-323;bt=Enterprise+Cluster+Planning+and+Installation+Manual;ps=ps/SUNWab_83_4/ETPCLINSTALL/B.Configuring_the_SCI_SBus_Card#3

        I kid you not about this url, but it is all we have ...

        The key to the install (given that the hardware is ok) is the file which defines the 
        geometry of the cluster.  In this case we have no switches, two links/cables, two 
        hosts, and four adapters.  The file link1.sc is presented for this purpose, and can
        be found on host cluster in /home/cluster/files/link1.sc.  It has been edited to fit
        the cluster as defined up until now.

        To run the configuration, select the console on neva:
        
        1) copy link1.sc to /opt/SUNWsma/bin
        
        2) run 
                ./sm_config -f link1.sc
                
        This command must be executed ONLY on the primary node.

        3) Assuming no pathological errors, there is no need to reboot.
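
        Since sm_config must only ever run on the primary, a small guard around it
        may be worth having.  The wrapper below is a sketch: configure_sci is a
        made-up name, PRIMARY and SMA_BIN default to the values used in this section,
        and the check assumes "uname -n" returns the short host name.

```shell
#!/bin/sh
# configure_sci: run sm_config only when this host is the primary node.
# DRY_RUN=1 prints the command instead of running it.
configure_sci() {
    primary=${PRIMARY:-neva}
    sma_bin=${SMA_BIN:-/opt/SUNWsma/bin}
    this_host=`uname -n`
    if [ "$this_host" != "$primary" ]; then
        echo "refusing: this is $this_host, sm_config must run on $primary only" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "cd $sma_bin && ./sm_config -f link1.sc"
    else
        cd "$sma_bin" && ./sm_config -f link1.sc
    fi
}
```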


195.12       Install ORCLudlm patch

        This patch must be applied on both nodes before Veritas or Oracle software is added.

        1) Copy the patch from host cluster located in /home/oracle/ORCLudlm.tar.Z
           to /tmp and unpack

        2) Install via pkgadd

                cd /tmp; pkgadd -d . ORCLudlm
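
        For anyone unsure how to unpack a .tar.Z file, the two steps above might look
        like the sketch below.  It assumes rcp to host cluster works (rsh is still
        enabled at this stage) and that the archive unpacks into /tmp; install_udlm
        and run are names invented for this note.  DRY_RUN=1 prints the commands only.

```shell
#!/bin/sh
# run CMD: print CMD when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$1"
    else
        sh -c "$1"
    fi
}

# install_udlm: fetch, uncompress/untar and pkgadd the ORCLudlm package.
# Run as root on each node.
install_udlm() {
    run "rcp cluster:/home/oracle/ORCLudlm.tar.Z /tmp/ORCLudlm.tar.Z" &&
    run "cd /tmp && zcat ORCLudlm.tar.Z | tar xf -" &&
    run "cd /tmp && pkgadd -d . ORCLudlm"
}
```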

195.13       Reboot all nodes

195.14       Install and configure Veritas packages and licenses on all nodes

        Installing the Veritas software packages can most easily be done using the same
        CCP gui that was used in the initial cluster install.  You will have to change
        the mounting point on volga for both servers to see the same path.

        Be sure to use the "Veritas Foundation Products 2000-08 for Solaris" copy of
        the software.

        1) Mount the cdrom on both servers

        2) cd to /cdrom/foundation_products_2000_08_sun/Solaris_8/pkgs and run

                pkgadd -d .

        3) Select all software options.  During the install, select default paths and say
           yes for the client/server install.  PDF documentation should be most portable.

        Software licenses are required to initialize the cluster functionality required for
        shared disks.  These licenses are located on host cluster in /home/cluster/veritas_lic
        and are named based on the host they are assigned to.  To install on to the requisite host,
        just cut and paste the license information into the appropriate terminal.
        To make sure which license file is for which host, run hostid on the machines.
        In each license file there is a reference to that hostid, can't miss it.

        Example:

        1) On neva, run "vxlicense -c" and press return

        2) Using neva1.htm (cat neva1.htm) cut and paste the key

                1860 9194 5635 4989 0189 5453 1966 0

        3) Repeat this one more time for this key, and one more time for the neva2.key

        4) Repeat for volga

        There is no need to reboot at this point
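
        Matching license files to hosts by eye works, but grep can do it as well.  The
        helper below is a sketch (find_license_files is a made-up name) that lists the
        files in a directory mentioning a given hostid; it assumes the hostid appears
        in the file as plain text, as described above.

```shell
#!/bin/sh
# find_license_files DIR HOSTID
# Prints the name of every file in DIR that mentions HOSTID
# (case-insensitive, since hostid prints lower-case hex).
find_license_files() {
    dir=$1
    id=$2
    grep -il "$id" "$dir"/* 2>/dev/null
}
```

        On a node this would be called as
        find_license_files /home/cluster/veritas_lic `hostid`
        with the license directory copied or mounted from host cluster.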

        To install the packages, follow these directions:

        1) Type vxinstall
        
        2) The installer will provide information about the disk controllers that it sees.  You
           can expect to see three (two fiber channel, and one SCSI).  Hit return to continue

        3) Select *Custom Installation*.  This is very important...

        4) You will be presented with the disks attached to each controller and a menu for what you 
           want to do with them.  What you do will be based on the controller and attached drives 
           offered to you.  

           If the install program offers to "Encapsulate Boot Disk" follow the directions in (5)
           If a large number of disks are offered up from controllers c0 or c4, follow directions in (6)

        5) Encapsulate Boot Disk:

                Select 'y' as this will make the boot device a veritas volume.

                        Encapsulate Boot Disk [y,n,q,?] (default: n) y

                Do not take the default name offered.  Instead select from the following:

                        Enter disk name for  [<name>,q,?] (default: rootdisk) rne001 (neva)
                        Enter disk name for  [<name>,q,?] (default: rootdisk) rvo001 (volga)

                The second disk on the boot controller should be offered up in the following menu:

                                Installation options for controller c2
                                Menu: VolumeManager/Install/Custom/c2

                         1      Install all disks as pre-existing disks. (encapsulate)
                         2      Install all disks as new disks. (discards data on disks!)
                         3      Install one disk at a time.
                         4      Leave these disks alone.

                         ?      Display help about menu
                         ??     Display help about the menuing system
                         q      Exit from menus

                        Select an operation to perform: (2)
        
                Select two, using the name rne002 or rvo002 depending on the host.

        6) Skipping the other disks:
        
                When presented with the option from the other two controllers:

                                Installation options for controller c0
                                Menu: VolumeManager/Install/Custom/c0

                         1      Install all disks as pre-existing disks. (encapsulate)
                         2      Install all disks as new disks. (discards data on disks!)
                         3      Install one disk at a time.
                         4      Leave these disks alone.

                         ?      Display help about menu
                         ??     Display help about the menuing system
                         q      Exit from menus

                        Select an operation to perform: (4)
        
                Select *four* for both controllers.  This is most important since cluster
                will not start with the quorum device encapsulated.

        7) You should see something like:

                  The following is a summary of your choices.

                        c2t0d0  Encapsulate
                        c2t1d0  New Disk
                
                  Is this correct [y,n,q,?] (default: y)

           when the install is complete.  If there are a large number of disks slated for encapsulation
           be sure to select 'n' and go back to correct the mistake.

        8) Select reboot from the offered menu
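
        The summary in step 7 is the last chance to catch a mistake before the reboot.
        If the console text is pasted into a file, a check along these lines can
        confirm that only the boot disk is marked for encapsulation.  This is a sketch;
        check_vx_summary is a made-up name and it relies only on the word
        "Encapsulate" ending a line, as in the sample summary above.

```shell
#!/bin/sh
# check_vx_summary: read a vxinstall summary on stdin and confirm that
# exactly one disk is slated for encapsulation.
check_vx_summary() {
    # grep -c prints 0 and exits non-zero when nothing matches; the
    # "|| :" keeps that from being treated as a failure.
    count=`grep -c 'Encapsulate *$' || :`
    if [ "$count" -eq 1 ]; then
        echo "ok: one disk slated for encapsulation"
        return 0
    fi
    echo "STOP: $count disks slated for encapsulation, answer 'n' and correct the selection"
    return 1
}
```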

195.15       Reboot all the nodes

        During the boot process, you should see the root partition encapsulation taking place.  After this
        happens, the machine(s) will reboot automatically.

195.16       Start primary cluster node

        The (first) moment of truth!

        When the machine has booted cleanly, log in as root and enter the following command on the
        *primary* node (i.e. neva):

                #scadmin startcluster neva prodwc-db-cluster

        You should see several dozen lines on the console at this point as the cluster reconfigures itself
        to the new configuration.  The start has succeeded as long as the last lines look similar to:

                prodwc-db-cluster node 0 (neva) is a cluster member
                prodwc-db-cluster node 1 (volga) is not a cluster member
                prodwc-db-cluster cluster reconf #1 finished

        Note: there will be date/time/syslog data as headers to the above information.  Since the
        data is presented as type local0.error, do not be concerned with the word error in this context.
        If this is not what you are seeing, or "quorum exited with 1 in startnode" is the last message,
        see 195.22 for troubleshooting information.
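
        If the console output has been captured to a file, the success and failure
        messages quoted above can be checked for mechanically.  The function below is
        a sketch only (check_startcluster is a made-up name); it looks for nothing
        beyond the three message patterns mentioned in this section.

```shell
#!/bin/sh
# check_startcluster NODE
# Reads console/syslog lines on stdin and reports whether NODE joined the
# cluster and the reconfiguration finished.
check_startcluster() {
    node=$1
    log=`cat`
    if echo "$log" | grep 'quorum exited with 1 in startnode' >/dev/null; then
        echo "FAILED: quorum error, see the troubleshooting section"
        return 1
    fi
    if echo "$log" | grep "($node) is a cluster member" >/dev/null &&
       echo "$log" | grep 'cluster reconf #[0-9]* finished' >/dev/null; then
        echo "ok: $node is a cluster member"
        return 0
    fi
    echo "UNKNOWN: expected membership lines not found"
    return 1
}
```

        For example, "check_startcluster neva < console.log", where console.log is
        whatever file the console text was saved to.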


195.17       Create oracle disk groups and raw volumes for OPS

        See the Raw Volume Creation notes under 195.19.

195.18       Restart cluster on all nodes

        See 195.21 for details




195.19       Install Oracle with the OPS option on all nodes


        The media is located in the blue Software Cabinet in 
        Brent's office.

        We are using Oracle 8.1.6
        
        *********************************************
        *       VERY IMPORTANT INFORMATION !!!!!!!!!*
        *********************************************
        
        In order for the Oracle Installer to show the Oracle Parallel Option 
        during the installation the cluster must be running on both machines.
        
        Also RSH must be enabled on both machines in order to install OPS.
        
        *********************************************
        *********************************************
                
        
        
        The /etc/system file must be backed up and have these values inserted
        
        * Begin Oracle info *
        set shmsys:shminfo_shmmax=4294967295
        set shmsys:shminfo_shmmin=1
        set shmsys:shminfo_shmmni=100
        set shmsys:shminfo_shmseg=10
        set semsys:seminfo_semmni=100
        set semsys:seminfo_semmsl=300
        set semsys:seminfo_semmns=600
        set semsys:seminfo_semopm=100
        set semsys:seminfo_semvmx=32767
        * End Oracle info *
        
        Once the media is inserted into the machine, Solaris automatically mounts
        the cdrom.
        
        To begin the installation:
        
        Export your DISPLAY variable. If using a Windows machine you must have
        an X Windows Server running locally.
        
        cd to /cdrom/oracle8i and type
        
        ./runInstaller
        
        Choose the location where you want the software installed.
        
        Choose Oracle8i Enterprise Edition 8.1.6.0.0
        
        Select Unix group name:         oinstall
        
        As root, execute /tmp/OraInstall/orainstRoot.sh
        
        Continue installation.
        
        From Available Products choose Oracle8i Enterprise Edition 8.1.6.0.0
        
        For Installation Types choose Custom.
        
        From Available Product Components uncheck the following:
        
        1.      Oracle Time Series
        2.      Oracle Visual Information Retrieval.
        3.      Oracle Advanced Security.
        4.      Oracle InterMedia.
        5.      Oracle Names
        6.      Oracle Connection Manager.
        7.      Oracle External Naming.
        8.      All the Enterprise Manager Products
        
        
        Choose the following:
        
        1.      All the JDBC Drivers.
        
        
        For Component Locations select Java Runtime Environment and press ENTER.
        
        On Cluster Node Selection Screen hit ENTER.
        
        For Privileged Operating System Groups choose oinstall for both.
        
        For question to Create Database say NO !!!
        
        For Summary just say INSTALL and go and have 10 coffees.
        
        When asked, run as root /share/oracle/product/8.1.6/root.sh
        
        Directory to use for client and server information [/files/nsr]? ENTER
        
        Enter the first NetWorker server's name [no more]: ENTER
        
        Start NetWorker daemons at end of install [yes]? no
        
        Do you want to continue with the installation of <ORCLclnt> [y,n,?] n
        
        Do you want to continue with installation [y,n,?] n
        
        Enter the full pathname of the local bin directory: [/usr/local/bin]: ENTER
        
        
        
                
        
        
        
        
        
        
        *********************************
        *       Raw Volume Creation     *
        *********************************
        
        
        
        To run OPS the database files must reside on raw volumes.
        To create the volumes there is a script that generates all
        the necessary commands and a config file. This script and file
        are located in:
        
        S:\ops_dev\opscore\main\cluster
        
        Or locally on the machine in:
        
        /share/oracle/vm
        
        The main script, vmcmdgen, takes two parameters: a config file (vm.neva.cfg)
        and an output command file.  The config file lists all the available disks and
        specifies the volume names, which disks to use for which volume, mirroring and
        striping information, and so on.
        
        To use, type the following:
        
        vmcmdgen vm.neva.cfg vm.neva.cmd 
        
        Then run vm.neva.cmd from the command line.
        
        
        
        
        ****************************
        *     DATABASE CREATION    *
        ****************************
        
        
        Once the software is installed, log in as user oracle and run the following scripts,
        located in S:\ops_dev\opscore\main\cluster or on the local machine in
        /share/oracle/admin/pcmprod/create.  These scripts will create an
        OPS database.  Once these scripts have executed successfully, all you have to do is
        start up the secondary instance on the secondary node.
        
        From the command prompt type svrmgrl.  Once you get the "SVRMGR>" prompt, type
        connect internal.  There is no password.
        Then run the scripts in the following order:
        
        1:      @/share/oracle/admin/pcmprod/create/crdbpcmprod.sql
        2:      @/share/oracle/product/8.1.6/rdbms/admin/catalog.sql
        3:      @/share/oracle/admin/pcmprod/create/crdb2pcmprod.sql
        4:      @/share/oracle/product/8.1.6/rdbms/admin/catproc.sql
        5:      @/share/oracle/product/8.1.6/rdbms/admin/caths.sql
        6:      @/share/oracle/product/8.1.6/rdbms/admin/otrcsvr.sql
        7:      @/share/oracle/product/8.1.6/rdbms/admin/utlsampl.sql
        
        
                then log in to svrmgrl as user SYSTEM and run
        8:      @/share/oracle/product/8.1.6/sqlplus/admin/pupbld.sql
                
                
        9:      @/share/oracle/admin/pcmprod/create/crredologs.sql
        10:     @/share/oracle/admin/pcmprod/create/crrollbacksegs.sql
        11:     @/share/oracle/product/8.1.6/rdbms/admin/catparr.sql
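
        Before starting an eleven-script run it may be worth confirming that every
        script is actually in place.  The check below is a sketch: the function name
        and the optional ROOT prefix argument are inventions for this note, while the
        paths are the ones listed above, in the same order.

```shell
#!/bin/sh
# The scripts from steps 1-11, in run order.
CREATE_SCRIPTS="/share/oracle/admin/pcmprod/create/crdbpcmprod.sql
/share/oracle/product/8.1.6/rdbms/admin/catalog.sql
/share/oracle/admin/pcmprod/create/crdb2pcmprod.sql
/share/oracle/product/8.1.6/rdbms/admin/catproc.sql
/share/oracle/product/8.1.6/rdbms/admin/caths.sql
/share/oracle/product/8.1.6/rdbms/admin/otrcsvr.sql
/share/oracle/product/8.1.6/rdbms/admin/utlsampl.sql
/share/oracle/product/8.1.6/sqlplus/admin/pupbld.sql
/share/oracle/admin/pcmprod/create/crredologs.sql
/share/oracle/admin/pcmprod/create/crrollbacksegs.sql
/share/oracle/product/8.1.6/rdbms/admin/catparr.sql"

# check_create_scripts [ROOT]
# Reports every script that is missing or unreadable. ROOT is an optional
# path prefix (default: none) so the check can be tried against a copy.
check_create_scripts() {
    root=$1
    status=0
    for script in $CREATE_SCRIPTS; do
        if [ ! -r "$root$script" ]; then
            echo "missing: $root$script"
            status=1
        fi
    done
    return $status
}
```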
        
        
        The scripts in steps 1, 3, 9 and 10 are already modified to create
        an OPS database called PCMPROD.
        Naturally the *.ORA scripts are instance specific and
        should be modified accordingly. Mostly the difference 
        is the instance name.
        
        ex: PCMPROD1 and PCMPROD2.
        
        
        
                
                
        
        
        
        Once all the create and admin scripts have run successfully,
        edit the following files:
        
        on primary host.....
        INITPCMPROD.ORA
        INITPCMPROD1.ORA
        
        and on secondary host.....
        INITPCMPROD.ORA 
        INITPCMPROD2.ORA
        
        These files are located in /share/oracle/admin/pcmprod/pfile
        with symbolic links in /share/oracle/product/8.1.6/dbs with the same names
        pointing back.
        The working examples are located in /share/ops_dev/opscore/main/cluster.
        
        To create symbolic links use the following command from this directory:
        
        /share/oracle/product/8.1.6/dbs
        
        ln -s /share/oracle/admin/pcmprod/pfile/initpcmprod.ora initpcmprod.ora
        
        ln -s /share/oracle/admin/pcmprod/pfile/initpcmprod1.ora initpcmprod1.ora
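
        The same links can be made with a small loop, which also avoids clobbering a
        file that is already there.  This is a sketch; link_pfiles is a made-up name
        and the directories are passed in as arguments rather than hard-coded.

```shell
#!/bin/sh
# link_pfiles PFILE_DIR DBS_DIR NAME...
# Creates a symbolic link in DBS_DIR for each named parameter file found
# in PFILE_DIR. Existing files or links in DBS_DIR are left untouched.
link_pfiles() {
    pfile_dir=$1
    dbs_dir=$2
    shift 2
    status=0
    for name in "$@"; do
        if [ ! -f "$pfile_dir/$name" ]; then
            echo "skipping $name: not found in $pfile_dir" >&2
            continue
        fi
        if [ -f "$dbs_dir/$name" ] || [ -h "$dbs_dir/$name" ]; then
            echo "skipping $name: already present in $dbs_dir" >&2
            continue
        fi
        ln -s "$pfile_dir/$name" "$dbs_dir/$name" || status=1
    done
    return $status
}
```

        On the primary host this would be called as
        link_pfiles /share/oracle/admin/pcmprod/pfile /share/oracle/product/8.1.6/dbs initpcmprod.ora initpcmprod1.ora
        and on the secondary with initpcmprod2.ora in place of initpcmprod1.ora.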
        
        
        
        Once both instances of OPS on both nodes are created,
        make sure that the following parameter is in both INITPCMPROD.ORA
        files on both machines:
        
        remote_login_passwordfile = exclusive
        
        Make sure that the parameter rollback_segments contains only the
        private rollback segments or else the instances will not come up.
        
        To create the ORAPW<db_name> file, run the following command from the
        /share/oracle/product/8.1.6/dbs directory:
        
        orapwd file=orapw<db_name> password=<sys user password> 
        
        Once these steps are done, you are ready to start up the LISTENER
        on both nodes.
        
        Start the LISTENER by typing:
        lsnrctl start
        
        on both nodes.
        
        On the primary node, log in as user oracle and start svrmgrl.
        Type connect internal, then type startup.  Once the instance is up,
        do the same on the secondary node.
        
        
        
        ********************************************************************
        The following steps are for running the database in ARCHIVE LOG MODE
        ********************************************************************
        
        
        In order to start up the instances in ARCHIVE LOG MODE,
        make sure that the following parameters are set in the INIT.ORA files on both nodes.
        
        log_archive_start = true
        log_archive_dest_1 = "location=/share/oracle/admin/pcmprod/arch"
        log_archive_format = arch_%t_%s.arc

        Also make sure that the PARALLEL_SERVER parameter is set to FALSE.
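
        A quick way to confirm the four settings before attempting the startup is to
        grep the parameter file for them.  The function below is a sketch
        (check_archive_params is a made-up name); it checks only that the lines exist
        in the one file it is given, not that an included file overrides them.

```shell
#!/bin/sh
# check_archive_params FILE
# Reports any of the archive-log settings from this section that are not
# found in FILE. Matching is case-insensitive.
check_archive_params() {
    file=$1
    status=0
    for pattern in \
        '^ *log_archive_start *= *true' \
        '^ *log_archive_dest_1 *=' \
        '^ *log_archive_format *=' \
        '^ *parallel_server *= *false'
    do
        if grep -i "$pattern" "$file" >/dev/null 2>&1; then
            :
        else
            echo "not set as expected in $file: $pattern"
            status=1
        fi
    done
    return $status
}
```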
        
        Start up the primary instance in exclusive mode by issuing the following command.
        
        SVRMGR> startup mount
        ORACLE instance started.
        Total System Global Area                        210692528 bytes
        Fixed Size                                          51632 bytes
        Variable Size                                   143351808 bytes
        Database Buffers                                 67108864 bytes
        Redo Buffers                                       180224 bytes
        Database mounted.
        
        Followed by:
        
        SVRMGR> alter database archivelog;
        Statement processed.
        
        SVRMGR> alter database open;
        Statement processed.
        
        
        To verify that the DB is actually in ARCHIVE LOG MODE type the following command.
        
        SVRMGR> archive log list
        Database log mode              Archive Mode
        Automatic archival             Enabled
        Archive destination            /share/oracle/admin/edmprod1/arch
        Oldest online log sequence     5
        Next log sequence to archive   7
        Current log sequence           7
        
        
        
        
        
        
        In order to configure Net8 (previously known as SQL*Net) edit the
        LISTENER.ORA and TNSNAMES.ORA to contain proper entries for the
        hosts and names of databases and instances.
        These files are located in:
        /share/oracle/product/8.1.6/network/admin
        
        The working examples
        are located in /share/ops_dev/opscore/main/cluster.
        
        
        
195.20       The following steps are for installing the Cluster software on savage and selway.
        Also for creating an OPS database on savage and selway.
        
        
        
                                       ************************
                                       * CLUSTER INSTALLATION *
                                       ************************
                                        
                                        
        1.      Install the client on thames by running scinstall.
                ./scinstall
                        
        2.      On Software Selection Menu choose option 3. Client.
        
        
        3.      Choose automatic install mode.
        
        4.      Choose option 5 to quit client installation.
        
        5.      Start the ccp (Cluster Control Panel) on thames, located in /opt/SUNWcluster/bin:
                ./ccp dev-db-cluster
                where dev-db-cluster is the name of the cluster.
        
        6.      cd to /cdrom/multi_suncluster_sc_2_2/Sun_Cluster_2_2/Sol_2.8/Tools and run scinstall.
        
        7.      On Main Menu choose install option 1. Install/Upgrade Server Packages.
        
        8.      On Software Selection Menu choose option 2 Server.
        
        9.      Specify the path to the cdrom. Remember that the cdrom is shared so on one machine you
                will have to manually type out the path.
        
        10.     For Volume Manager Selection choose Cluster Volume Manager, option 1.
        
        11.     Hit ENTER when asked if CVM must be installed before OPS can be started.
        
        12.     The name of the cluster is dev-db-cluster.
        
        13.     Potential nodes = 2.
        
        14.     Initially configured nodes = 2.
        
        15.     Choose Ether not SCI.
        
        16.     Hostname of node 0 is savage.
        
        17.     What is savage's first private network interface [hme0] qfe0
        
        18.     What is savage's second private network interface [hme1] qfe1
        
        19.     What is savage's ethernet address ?     get answer from ifconfig -a
        
        20.     Hostname of node 1 is selway.
        
        21.     What is selway's first private network interface [hme0] qfe0
        
        22.     What is selway's second private network interface [hme1] qfe1
        
        23.     What is selway's ethernet address ?     get answer from ifconfig -a
        
        24.     HA data services = yes.
        
        25.     Logical hosts = no.
        
        26.     Place the quorum device on disk c2t52d0. The disk number differs between the two machines.
        
        27.     Specify the path to the cdrom again.
        
        28.     For Data Service choose #10, Sun Cluster for Oracle Parallel Server.
        
        29.     Choose automatic install mode.
        
        30.     SANITY CHECK !!!        Choose #3 to verify the installation.
        
        31.     Choose option #5 to quit the installation.
        
        32.     Reboot all nodes.
        
        33.     Create the groups oinstall and dba.
        
        34.     Install package ORCLudlm located in /home/oracle/cluster
        
        35.     Choose yes when asked if you want to continue with the installation.
        
        36.     Reboot all nodes.
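
        Steps 33 and 34 can be sketched as a small shell helper.  This is a sketch under
        assumptions, not the procedure as originally run: it assumes the standard Solaris
        groupadd and pkgadd commands, and that ORCLudlm sits in /home/oracle/cluster in a
        form that "pkgadd -d" accepts.  The run helper, the DRY_RUN variable and the
        install_udlm name are illustrative.  By default it only prints the commands.

```shell
# Sketch of steps 33 and 34; run as root on each node.
# PKG_DIR comes from step 34; DRY_RUN and the helper names are illustrative.
PKG_DIR=/home/oracle/cluster
DRY_RUN=${DRY_RUN:-yes}

# Print the command when DRY_RUN=yes, otherwise execute it.
run() {
    if [ "$DRY_RUN" = yes ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the two groups, then add the ORCLudlm package.
# Stops at the first command that fails.
install_udlm() {
    run groupadd oinstall &&
    run groupadd dba &&
    run pkgadd -d "$PKG_DIR" ORCLudlm
}
```

        Review the printed commands first, then set DRY_RUN=no to execute them.  pkgadd
        still asks whether to continue (step 35), and the reboot in step 36 is left manual.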
        
        
        
        
        
        
        
        
                                       *************************
                                       * OPS Database Creation *
                                       *************************
                                       
                                       
                                       
                                       
        1.      The OPS Unix servers are savage and selway, savage being the primary node.
        
        2.      Log on to savage as user oracle (password in the usual place).
        
        3.      Source the oraenv file from the command line like this:
        
                {oracle@savage:155} . oraenv
                ORACLE_SID = [oracle] ? opsdev
                
                opsdev is the database SID (system identifier, i.e. the database name).
                
        4.      cd to /share/oracle/admin/opsdev/pfile and comment out the following line in initopsdev.ora:
        
                #remote_login_passwordfile = exclusive
        
                Then cd to /share/oracle/admin/opsdev/create
        
        5.      Log on to SVRMGRL and type "connect internal" like this:
        
                {oracle@savage:156} svrmgrl
                
                Oracle Server Manager Release 3.1.6.0.0 - Production
                
                Copyright (c) 1997, 1999, Oracle Corporation.  All Rights Reserved.
                
                Oracle8i Enterprise Edition Release 8.1.6.0.0 - Production
                With the Partitioning and Parallel Server options
                JServer Release 8.1.6.0.0 - Production
                
                SVRMGR> connect internal
                Connected.
                SVRMGR> 
                
        6.      Then execute the database creation scripts in the order shown, connected as the user shown.
                Note that the scripts are located in different directories.
                
                ORDER IS CRUCIAL FOR SUCCESS!!!
        
                SVRMGR>connect internal
                SVRMGR>@opsdev1run.sh
                
                SVRMGR>@/share/oracle/product/8.1.6/rdbms/admin/catproc.sql
                
                SVRMGR>@opsdev1run1.sh
                
                SVRMGR>@/share/oracle/product/8.1.6/rdbms/admin/catproc.sql
                SVRMGR>@/share/oracle/product/8.1.6/rdbms/admin/caths.sql
                SVRMGR>@/share/oracle/product/8.1.6/rdbms/admin/otrcsvr.sql
                SVRMGR>@/share/oracle/product/8.1.6/rdbms/admin/utlsampl.sql
                
                SVRMGR>connect system/manager
                SVRMGR>@/share/oracle/product/8.1.6/sqlplus/admin/pupbld.sql
                
                SVRMGR>@opsdev1psrbs.sh
                SVRMGR>@opsdev1pslog.sh
                
                SVRMGR>connect internal
                SVRMGR>@/share/oracle/product/8.1.6/rdbms/admin/catparr.sql
                
                
        7.      Once all scripts have executed successfully, shut down the database on savage:
                
                SVRMGR>connect internal
                SVRMGR>shutdown immediate
                
        8.      On BOTH MACHINES cd to $ORACLE_HOME/dbs and type the following from the command line:
        
                orapwd file=orapwopsdev password=change_on_install
                
                This creates the Oracle password file.
                
                Then uncomment the following line in /share/oracle/admin/opsdev/pfile/initopsdev.ora:
                
                remote_login_passwordfile = exclusive
                
        9.      On savage first, log in to SVRMGR and type startup. You should see the following:
        
                SVRMGR> startup
                ORACLE instance started.
                Total System Global Area                        215617520 bytes
                Fixed Size                                          69616 bytes
                Variable Size                                   211173376 bytes
                Database Buffers                                  4194304 bytes
                Redo Buffers                                       180224 bytes
                Database mounted.
                Database opened.
                SVRMGR>
                
                Repeat the same startup steps on selway.
                
        10.     Test OPS by typing the following command in SVRMGR; it creates a test table.
        
                SVRMGR>create table test (id number(1));
                Statement processed.
                
                Then on the other machine type:
                
                SVRMGR> desc test
                Column Name                    Null?    Type
                ------------------------------ -------- ----
                ID                                      NUMBER(1)
                
                
                
        If that is what you see then you're in business.
        
        
        
        

195.21       Starting and stopping the cluster

        The normal startup sequence for the cluster nodes is primary (neva) first, then secondary (volga).
        Shutdown is done in the reverse order.

        To start:

                On neva:

                # scadmin startcluster neva prodwc-db-cluster

                On volga:

                # scadmin startnode

        To stop:

                On volga:

                # scadmin stopnode

                On neva:

                # scadmin stopnode
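
        The choice of command per node can be captured in a small shell function.  This is
        a sketch only: the function name cluster_command is made up for illustration, and
        it prints the scadmin command rather than running it, so the operator still
        controls the order (neva first on start, volga first on stop).

```shell
# Node and cluster names from the text above.
PRIMARY=neva
CLUSTER=prodwc-db-cluster

# Print the scadmin command for an action ("start" or "stop") on a host.
cluster_command() {
    action=$1
    host=$2
    case "$action" in
        start)
            # Only the primary creates the cluster; other nodes join it.
            if [ "$host" = "$PRIMARY" ]; then
                echo "scadmin startcluster $PRIMARY $CLUSTER"
            else
                echo "scadmin startnode"
            fi
            ;;
        stop)
            echo "scadmin stopnode"
            ;;
        *)
            echo "usage: cluster_command start|stop hostname" >&2
            return 1
            ;;
    esac
}
```

        For example, cluster_command start `uname -n` prints the startcluster line on neva
        and "scadmin startnode" on volga.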

195.22       Cluster Troubleshooting

        Troubleshooting the cluster can be complex.  To begin with we show the expected startup
        and shutdown data from the perspective of the local console.  Note that some messages
        will appear on the master console when a secondary node joins or abandons a cluster group.

        Complete examples of logs can be found on cluster in /home/cluster/startuplogs .

        Primary cluster node startup messages:

                # scadmin startcluster neva prodwc-db-cluster
                Node specified is neva
                Cluster specified is prodwc-db-cluster
                =========================== WARNING =================================
                =                     Creating a new cluster                        =
                =====================================================================
                You are attempting to start up the cluster node neva as the
                only node in a new cluster.  It is important that no other cluster
                nodes be active at this time.  If this node hears from other cluster
                nodes, this node will abort.  Other nodes may join only after this
                command has completed successfully.  Data corruption may occur if
                more than one cluster is active.

                Do you want to continue?  [yes or no] yes
                Feb 28 14:00:35 neva ID[SUNWcluster.reconf.1150]: Starting Sun Cluster: node 0 (neva) joining the prodwc-db-cluster cluster
                Starting Sun Cluster software - joining the prodwc-db-cluster cluster 
                Feb 28 14:00:46 neva ID[SUNWcluster.reconf.1200]: Reconfiguration step start started
                Feb 28 14:00:46 neva ID[SUNWcluster.sma.smad.1102]: smad: Cluster 'prodwc-db-cluster' monitoring
                Feb 28 14:00:46 neva ID[SUNWcluster.reconf.udlm.1000]: prodwc-db-cluster starting the Unix DLM.
                Feb 28 14:00:47 neva ID[vxclust]: starting start time: 02/28 14:00:47.442:  seq # 0
                Feb 28 14:00:48 neva ID[vxclust]: max nodes defined in the cluster=2
                Feb 28 14:00:48 neva ID[vxclust]: ending step start time: 02/28 14:00:48.014: 
                Feb 28 14:00:48 neva ID[SUNWcluster.reconf.1201]: Reconfiguration step start completed
                Feb 28 14:00:48 neva ID[SUNWcluster.reconf.1200]: Reconfiguration Step 1 started
                Feb 28 14:00:48 neva ID[SUNWcluster.reconf.1120]: prodwc-db-cluster reconfiguration 1 started on neva
                Feb 28 14:00:48 neva ID[SUNWcluster.sma.smad.1103]: smad: Cluster 'prodwc-db-cluster' running
                Feb 28 14:00:48 neva ID[SUNWcluster.sma.monitor.6011]: net 0 is up
                Feb 28 14:00:48 neva ID[SUNWcluster.sma.monitor.6011]: net 1 is up
                Feb 28 14:00:48 neva ID[SUNWcluster.sma.smad.1030]: prodwc-db-cluster net 0 (scid0:1) selected
                Feb 28 14:00:48 neva ID[SUNWcluster.udlm.1000]: Unix DLM version (2) and SUN unix dlm library version (1): compatible.
                Feb 28 14:00:49 neva ID[SUNWcluster.reconf.quorumdev.1000]: prodwc-db-cluster reserving 0041L04LXM as quorum device
                Feb 28 14:00:55 neva ID[SUNWcluster.reconf.1201]: Reconfiguration Step 1 completed
                Feb 28 14:00:55 neva ID[SUNWcluster.reconf.1200]: Reconfiguration Step 2 started
                Feb 28 14:00:56 neva ID[vxclust]: starting step1 time: 02/28 14:00:56.189:  seq # 1
                Feb 28 14:00:56 neva ID[vxclust]: members 1 joiners 1 leavers 0
                Feb 28 14:00:56 neva ID[vxclust]: ending step step1 time: 02/28 14:00:56.191: 
                Feb 28 14:00:56 neva ID[SUNWcluster.reconf.1201]: Reconfiguration Step 2 completed
                Feb 28 14:00:56 neva ID[SUNWcluster.reconf.1200]: Reconfiguration Step 3 started
                Feb 28 14:00:56 neva ID[vxclust]: starting step2 time: 02/28 14:00:56.598:  seq # 1
                Feb 28 14:00:56 neva ID[vxclust]: CVM:MASTER=0 SELF=0
                Feb 28 14:00:56 neva ID[vxclust]: ending step step2 time: 02/28 14:00:56.600: 
                Feb 28 14:00:56 neva ID[SUNWcluster.reconf.ccd.1703]: prodwc-db-cluster starting ccdd.
                Feb 28 14:00:56 neva ID[SUNWcluster.reconf.ccd.1705]: prodwc-db-cluster starting ccdd completed.
                Feb 28 14:01:00 neva ID[SUNWcluster.reconf.1201]: Reconfiguration Step 3 completed
                Feb 28 14:01:00 neva ID[SUNWcluster.reconf.1200]: Reconfiguration Step 4 started
                Feb 28 14:01:00 neva ID[SUNWcluster.reconf.ccd.1706]: CCD Step 4 transition CCD_up=0
                Feb 28 14:01:01 neva ID[SUNWcluster.reconf.ccd.1707]: CCD Step 4 transition completed.
                Feb 28 14:01:02 neva ID[SUNWcluster.reconf.1201]: Reconfiguration Step 4 completed
                Feb 28 14:01:02 neva ID[SUNWcluster.reconf.1200]: Reconfiguration Step 5 started
                Feb 28 14:01:02 neva ID[SUNWcluster.reconf.1201]: Reconfiguration Step 5 completed
                Feb 28 14:01:02 neva ID[SUNWcluster.reconf.1200]: Reconfiguration Step 6 started
                Feb 28 14:01:02 neva ID[SUNWcluster.reconf.1201]: Reconfiguration Step 6 completed
                Feb 28 14:01:03 neva ID[SUNWcluster.reconf.1200]: Reconfiguration Step 7 started
                Feb 28 14:01:03 neva ID[SUNWcluster.reconf.1201]: Reconfiguration Step 7 completed
                Feb 28 14:01:03 neva ID[SUNWcluster.reconf.1200]: Reconfiguration Step 8 started
                Feb 28 14:01:03 neva ID[vxclust]: starting step3 time: 02/28 14:01:03.901:  seq # 1
                vxvm:vxconfigd: WARNING: master_send_diskid : diskid send : 982869147.1608.neva
                vxvm:vxconfigd: WARNING: master_send_diskid : diskid send : 982868779.1596.neva
                vxvm:vxconfigd: WARNING: master_send_diskid : diskid send : 982868227.1589.neva
                vxvm:vxconfigd: WARNING: master_send_diskid : diskid send : 982870851.1612.neva
                dg dbbdg usetype fsgen: start edmprodctrl01.ctl edmprodctrl02.ctl edmprodctrl03.ctl
                edmproddata01.dbf edmprodindex01.dbf edmprodrbs01.dbf edmprodredo01_1.log
                edmprodredo01_2.log edmprodredo01_3.log edmprodredo02_1.log edmprodredo02_2.log
                edmprodredo02_3.log edmprodsystem01.dbf edmprodtemp01.dbf edmprodtools01.dbf
                Feb 28 14:01:39 neva ID[vxclust]: ending step step3 time: 02/28 14:01:39.657: 
                Feb 28 14:01:39 neva ID[SUNWcluster.pnm.pnmd.2005]: pnmd daemon is shutting down
                VERITAS VM Storage Administrator Server terminated.
                Stopping VERITAS VM Storage Administrator Server
                Feb 28 14:01:42 neva ID[SUNWcluster.pnm.pnmd.2006]: Values for PNM tuneable parameters - inactive_time (5 s), ping_timeout (4 s), rep_test (3 time(s)), slow_network (2 s)
                Feb 28 14:01:42 neva ID[SUNWcluster.pnm.pnmd.3001]: cannot open config file
                Feb 28 14:01:43 neva ID[SUNWcluster.reconf.1201]: Reconfiguration Step 8 completed
                Feb 28 14:01:43 neva ID[SUNWcluster.reconf.1200]: Reconfiguration Step 9 started
                Feb 28 14:01:43 neva ID[vxclust]: starting step4 time: 02/28 14:01:43.595:  seq # 1
                Feb 28 14:01:43 neva ID[vxclust]: ending step step4 time: 02/28 14:01:43.605: 
                Feb 28 14:01:43 neva ID[SUNWcluster.reconf.1201]: Reconfiguration Step 9 completed
                Feb 28 14:01:43 neva ID[SUNWcluster.reconf.1200]: Reconfiguration Step 10 started
                Feb 28 14:01:50 neva ID[SUNWcluster.reconf.1201]: Reconfiguration Step 10 completed
                Feb 28 14:01:50 neva ID[SUNWcluster.reconf.1200]: Reconfiguration Step 11 started
                Feb 28 14:01:50 neva ID[SUNWcluster.cvm.1025]: cluster volume manager shared access mode enabled
                Feb 28 14:01:50 neva ID[SUNWcluster.cvm.1010]: - node neva vm_on_node is master
                Feb 28 14:01:51 neva ID[SUNWcluster.reconf.1201]: Reconfiguration Step 11 completed
                Feb 28 14:01:51 neva ID[SUNWcluster.reconf.1200]: Reconfiguration Step 12 started
                Feb 28 14:01:52 neva ID[SUNWcluster.reconf.1201]: Reconfiguration Step 12 completed
                Feb 28 14:01:52 neva ID[SUNWcluster.clustd.1920]: prodwc-db-cluster node 0 (neva) is a cluster member
                Feb 28 14:01:52 neva ID[SUNWcluster.clustd.1930]: prodwc-db-cluster node 1 (volga) is not a cluster member
                Feb 28 14:01:52 neva ID[SUNWcluster.clustd.1940]: prodwc-db-cluster cluster reconf #1 finished
                Feb 28 14:01:52 neva netfmd[14589]: Starting up
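
        The membership lines at the end of the startup log above are the quickest thing to
        check.  A minimal sketch that pulls them out of a saved log follows.  The function
        name cluster_members is made up for illustration, and the match is on the message
        wording seen in this log, which is assumed to be stable across reconfigurations.

```shell
# Print "<node> member" or "<node> non-member" for each clustd membership
# line.  Reads the named files, or standard input if none are given.
cluster_members() {
    sed -n \
        -e 's/.* (\([^)]*\)) is a cluster member.*/\1 member/p' \
        -e 's/.* (\([^)]*\)) is not a cluster member.*/\1 non-member/p' \
        "$@"
}
```

        Fed the startup log above, it prints "neva member" and "volga non-member", which
        is the expected state before volga has run scadmin startnode.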


        Primary cluster node shutdown messages:

                # scadmin stopnode
                Assuming a default cluster name of prodwc-db-cluster
                Stopping the Sun Cluster software - leaving the prodwc-db-cluster cluster
                Feb 28 13:59:39 neva ID[SUNWcluster.pnm.pnmd.2005]: pnmd daemon is shutting down
                Feb 28 13:59:41 neva ID[SUNWcluster.pnm.pnmd.3001]: cannot open config file
                Feb 28 13:59:42 neva ID[vxclust]: starting stop time: 02/28 13:59:42.004:  seq # 3
                Feb 28 13:59:42 neva ID[vxclust]: stop: successfully finished
                Feb 28 13:59:42 neva ID[SUNWcluster.reconf.1200]: Reconfiguration step return started
                Feb 28 13:59:42 neva ID[vxclust]: starting return time: 02/28 13:59:42.873:  seq # 3
                Feb 28 13:59:42 neva ID[vxclust]: ending step return time: 02/28 13:59:42.874: 
                Feb 28 13:59:42 neva ID[SUNWcluster.reconf.ccd.1708]: prodwc-db-cluster returning ccdd.
                Feb 28 13:59:43 neva ID[SUNWcluster.reconf.ccd.1709]: prodwc-db-cluster returning ccdd completed.
                Feb 28 13:59:43 neva ID[SUNWcluster.sma.smad.1104]: smad: Cluster 'prodwc-db-cluster' returning
                Feb 28 13:59:43 neva ID[SUNWcluster.reconf.1200]: Reconfiguration step return completed
                Feb 28 13:59:43 neva ID[SUNWcluster.reconf.1200]: Reconfiguration step abort started
                Feb 28 13:59:44 neva ID[SUNWcluster.pnm.pnmd.2005]: pnmd daemon is shutting down
                Feb 28 13:59:46 neva ID[SUNWcluster.pnm.pnmd.3001]: cannot open config file
                Feb 28 13:59:46 neva ID[vxclust]: starting abort time: 02/28 13:59:46.657:  seq # 3
                Feb 28 13:59:46 neva ID[vxclust]: calling volcvm_abort time: 02/28 13:59:46.658: 
                Feb 28 13:59:46 neva ID[vxclust]: volcvm_abort finished time: 02/28 13:59:46.659: 
                Feb 28 13:59:46 neva ID[vxclust]: ending step abort time: 02/28 13:59:46.659: 
                Feb 28 13:59:46 neva ID[SUNWcluster.reconf.ccd.1701]: prodwc-db-cluster aborting ccdd.
                Feb 28 13:59:47 neva ID[SUNWcluster.reconf.ccd.1702]: prodwc-db-cluster aborting ccdd completed.
                Feb 28 13:59:47 neva ID[SUNWcluster.sma.smad.1105]: smad: Cluster 'prodwc-db-cluster' no longer running
                Feb 28 13:59:47 neva ID[SUNWcluster.sma.smad.5010]: prodwc-db-cluster net 0 (scid0:1) de-selected
                Feb 28 13:59:47 neva ID[SUNWcluster.reconf.1201]: Reconfiguration step abort completed
                The prodwc-db-cluster cluster has no active hosts.
                Feb 28 13:59:48 neva ID[SUNWcluster.clustd.transition.4011]: cluster stopped on this node (neva)


195.23       SCI interface troubleshooting   

        The SCI interfaces are used as an inter-host communication device for cluster communications.
        The SCI host device uses the concept of rings, in which two or more nodes (SCI SBus cards) are
        mutually connected.  In a given ring, one host device is designated as the scrubber.  A card is
        assigned as the scrubber by setting its onboard scrubber jumper.

        Potential problems:
        
                1) Bad SCI board

                        A hardware error on an SCI interface will most likely betray itself in the
                        boot sequence.  The expected messages on the console should look something
                        like:

                        SCI Driver : version Dolphin IRM 1.9.8.1  ( 1999-10-22 ) initializing
                        SCI Driver : 32 bit mode Compiled Feb 29 2000 : 15:42:05
                        SCI Instance 0 : Awaiting configuration (Serial number is 17709 (0x452d))
                        SCI Instance 0 : adapter attached
                        SCI Instance 1 : Awaiting configuration (Serial number is 17609 (0x44c9))
                        SCI Instance 1 : adapter attached
                        .
                        .
                        SCI Instance 0 : Initializing as adapter number 0 (Serial number is 17709 (0x452d))
                        SCI Adapter 0 : NodeId is 8 ( 0x8 ) Serial no : 17709 (0x452d)
                        SCI Adapter 0 : The SCI link is not operational.
                        SCI Instance 1 : Initializing as adapter number 1 (Serial number is 17609 (0x44c9))
                        SCI Adapter 1 : NodeId is 12 ( 0xc ) Serial no : 17609 (0x44c9)
                        SCI Adapter 1 : The SCI link is not operational
                        SCI Adapter 0 : No switch found
                        SCI Adapter 0 : The SCI link is operational.
                        ID[SUNWcluster.sma.smak.1001]: SCI Adapter 0: Card operational
                        ID[SUNWcluster.sma.smak.1051]: SCI Adapter 0: Link operational
                        ID[SUNWcluster.sma.smak.1052]: SCI Adapter 0: Switch Serial Number: Not Available
                        SCI Adapter 1 : No switch found
                        SCI Adapter 1 : The SCI link is operational.
                        ID[SUNWcluster.sma.smak.1001]: SCI Adapter 1: Card operational
                        ID[SUNWcluster.sma.smak.1051]: SCI Adapter 1: Link operational
                        ID[SUNWcluster.sma.smak.1052]: SCI Adapter 1: Switch Serial Number: Not Available
                        ID[SUNWcluster.sma.smak.3001]: SCI Adapter 0 (0008): Session to 0048 active
                        ID[SUNWcluster.sma.smak.3001]: SCI Adapter 1 (000c): Session to 004c active
                        

                        Here the machine booted, found the devices and attached to the other host without
                        problem.  Note both adapters are listed as operational. 

                        Other indicators at run time can be found in /var/adm/messages.
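
                        In the boot messages each adapter first reports "not operational"
                        and later "operational", so the last report per adapter is the one
                        that matters.  A minimal sketch that extracts it follows.  The
                        function name sci_link_state is made up for illustration, and it
                        assumes the driver lines are logged with the wording shown above.
                        On Solaris 8 the awk call may need to be nawk.

```shell
# Print the last reported link state ("up" or "down") for each SCI adapter.
# Reads the named files, or standard input if none are given.
sci_link_state() {
    awk '
        /SCI Adapter [0-9]+ : The SCI link is/ {
            # The adapter number is the field right after "Adapter";
            # searching for it tolerates a syslog prefix on the line.
            for (i = 1; i < NF; i++)
                if ($i == "Adapter") { id = $(i + 1); break }
            if ($0 ~ /is not operational/)
                state[id] = "down"
            else
                state[id] = "up"
        }
        END {
            for (id in state) print "adapter " id ": " state[id]
        }
    ' "$@" | sort
}
```

                        Usage: sci_link_state /var/adm/messages.  An adapter that stays
                        "down" after boot is the first candidate for a bad board or cable.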

                2) Bad interconnect

                        Testing for a bad interconnect is like troubleshooting any other networking problem.
                        The methodology here can be used on all the remaining problems.

                        Since the interconnect is treated like a network interface by the operating
                        system, many of the usual tools can be used for locating broken connections.

                        neva:/> ifconfig -a
                        lo0: flags=1000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4> mtu 8232 index 1
                        inet 127.0.0.1 netmask ff000000 
                        hme0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
                        inet 192.168.10.22 netmask ffffff00 broadcast 192.168.10.255
                        scid0: flags=10080c1<UP,RUNNING,NOARP,PRIVATE,IPv4> mtu 16321 index 3
                        inet 204.152.65.1 netmask fffffff0 
                        scid1: flags=10080c1<UP,RUNNING,NOARP,PRIVATE,IPv4> mtu 16321 index 4
                        inet 204.152.65.17 netmask fffffff0 


                        neva:/> netstat -r

                        Routing Table: IPv4
                          Destination           Gateway           Flags  Ref   Use   Interface
                        -------------------- -------------------- ----- ----- ------ ---------
                        204.152.65.16        204.152.65.17         U        1      0  scid1
                        204.152.65.0         204.152.65.1          U        1      0  scid0
                        192.168.10.0         neva                  U        1      1  hme0
                        224.0.0.0            neva                  U        1      0  hme0
                        default              gateway               UG       1      0  
                        localhost            localhost             UH       2      6  lo0


                        neva:/> netstat -i 
                        Name  Mtu  Net/Dest      Address        Ipkts  Ierrs Opkts  Oerrs Collis Queue 
                        lo0   8232 loopback      localhost      489    0     489    0     0      0     
                        hme0  1500 neva          neva           666    0     413    0     0      0     
                        scid0 16321204.152.65.0  204.152.65.1   0      0     103    0     0      0     
                        scid1 16321204.152.65.16 204.152.65.17  0      0     103    0     0      0     
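
                        The packet and error counters for the scid interfaces can be pulled
                        out of "netstat -i" with a short filter.  This is a sketch only; the
                        function name sci_if_counters is made up for illustration.  Because
                        the Mtu and Net/Dest columns run together on the scid lines, the
                        fields are counted from the right-hand end of each line.  On
                        Solaris 8 the awk call may need to be nawk.

```shell
# Summarise packet and error counters for the scid interfaces.
# Reads "netstat -i" output from the named files or standard input.
sci_if_counters() {
    awk '
        $1 ~ /^scid[0-9]+$/ {
            # Columns from the right: Queue, Collis, Oerrs, Opkts, Ierrs, Ipkts
            ipkts = $(NF - 5); ierrs = $(NF - 4)
            opkts = $(NF - 3); oerrs = $(NF - 2)
            flag = ""
            if (ierrs + oerrs > 0) flag = " ERRORS"
            print $1 ": ipkts=" ipkts " ierrs=" ierrs " opkts=" opkts " oerrs=" oerrs flag
        }
    ' "$@"
}
```

                        Usage: netstat -i | sci_if_counters.  Running it twice a few seconds
                        apart shows whether the counters move at all on each interface.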
        

                        
                                
                3) Interface plumbing error
                4) Routing error
                5) Scrubber error
-------------------------------------------------------------------------------------


Veritas Configuration

        Export the DISPLAY variable.

        Type the following commands on the cluster console.

        # xhost + neva
        neva being added to access control list
        # xhost + volga
        volga being added to access control list